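- Switched the user_ask callback from a separate popup to the inline card path in the message body
- Choice pills, direct input, and submit/cancel buttons are handled inside the timeline
- Aligned plan approval and question requests under the same transcript-first UX principle
- Reflected the question/approval UX criteria in the claw-code parity document and the development history document

Verification result
- dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\
- 0 warnings / 0 errors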
# Claw Code Parity Plan (Rewritten)

## Scope
- Align AX Copilot with claw-code quality for loop reliability, permission/hook behavior, and session durability.

## Update
- Updated: 2026-04-05 15:34 (KST)
- Rebased the AX Agent improvement plan on actual `claw-code` runtime files instead of earlier AX snapshots. The reference spine is now `src/bootstrap/state.ts -> src/bridge/initReplBridge.ts -> src/bridge/sessionRunner.ts -> src/screens/REPL.tsx -> src/components/Messages.tsx -> src/components/StatusLine.tsx`.
- AX Agent work should follow that same quality order: state first, execution second, render last. UI-only fixes that bypass state/execution should be treated as temporary.
- Updated: 2026-04-05 16:55 (KST)
- Current estimated parity vs `claw-code`: core execution engine `82%`, main chat UI `68%`, Cowork/Code status UX `63%`, internal settings linkage `88%`, overall AX Agent `74%`.
- Engine-affecting settings should be handled conservatively during parity work. If a setting changes the main execution route, approval flow, or recovery behavior without representing a stable real-world user choice, it should be moved to developer-only UI or removed from user-facing surfaces.

## Preserved History (Summary)
- Core loop guards and post-tool verification gates are already partially implemented.
- Plan Mode, parallel tool execution, and unknown-tool recovery are in place.
- Session restore hardening is ongoing.

## Reference Map

| claw-code reference | AX apply target | completion criteria | quality criteria |
|---|---|---|---|
| `src/bootstrap/state.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs` | one canonical runtime/session state for current turn, queue, retry, execution events, and persisted snapshot | reopen/retry/queue flows do not create duplicate or blank assistant messages |
| `src/bridge/initReplBridge.ts` | `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs` | send/regenerate/retry/queued follow-up/slash all enter through one prepared-execution path | same input under same settings takes same execution route regardless of entry point |
| `src/bridge/sessionRunner.ts` | `src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs` | tool start/result/error/progress normalized once inside loop layer | Cowork/Code no longer flash repeated status strings or overshare debug payloads |
| `src/bridge/bridgeMessaging.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AgentLoopService.cs` | inbound execution events separated from display-only events before UI render | execution event replay does not duplicate visible timeline banners |
| `src/screens/REPL.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | screen state transitions, queue flow, retry flow, and composer state use shared runtime helpers | window resize, queue chaining, and retry feel stable instead of UI-patched |
| `src/components/Messages.tsx` | `src/AxCopilot/Views/ChatWindow.xaml.cs` | timeline derives from normalized conversation/session state only | no token-only completions, blank cards, or direct injected duplicates |
| `src/components/StatusLine.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | status strip computed from debounced runtime state, not multiple imperative refresh calls | metadata stays lightweight and does not overpower message timeline |

## AX Agent Improvement Phases

### Phase A. Runtime State Canonicalization
- Reference: `src/bootstrap/state.ts`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs`
- Completion criteria:
  - `Chat`, `Cowork`, `Code` all update one shared runtime/session state model.
  - queue, retry, post-compaction, and execution-event state can be restored after reopen.
- Quality criteria:
  - reopening a conversation reproduces the same visible timeline without extra assistant cards.
  - queue and execution badges remain in sync with the stored conversation.

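The Phase A criteria can be illustrated with a small sketch of one shared state model that every tab mutates through the same reducer and that round-trips through a persisted snapshot. This is written in TypeScript to mirror the `claw-code` reference side; all names here (`SessionState`, `reduce`, `snapshot`, `restore`) are hypothetical and are not taken from `state.ts` or from the AX C# sources.

```typescript
// Hypothetical shared runtime/session state; not the actual AX or claw-code types.
type Tab = "Chat" | "Cowork" | "Code";

interface SessionState {
  tab: Tab;
  messages: { role: "user" | "assistant"; text: string }[];
  queue: string[]; // queued follow-up prompts
  retryCount: number;
}

type Action =
  | { kind: "userSent"; text: string }
  | { kind: "assistantCommitted"; text: string }
  | { kind: "queued"; text: string }
  | { kind: "retried" };

function createState(tab: Tab): SessionState {
  return { tab, messages: [], queue: [], retryCount: 0 };
}

// Single update path: every tab goes through this pure reducer.
function reduce(state: SessionState, action: Action): SessionState {
  switch (action.kind) {
    case "userSent":
      return { ...state, messages: [...state.messages, { role: "user", text: action.text }] };
    case "assistantCommitted":
      // Blank commits are dropped so a reopen cannot produce an empty assistant card.
      if (action.text.trim() === "") return state;
      return { ...state, messages: [...state.messages, { role: "assistant", text: action.text }] };
    case "queued":
      return { ...state, queue: [...state.queue, action.text] };
    case "retried":
      return { ...state, retryCount: state.retryCount + 1 };
  }
}

// Persisted snapshot: what is stored is exactly what is restored.
function snapshot(state: SessionState): string {
  return JSON.stringify(state);
}

function restore(json: string): SessionState {
  return JSON.parse(json) as SessionState;
}
```

Because the reducer is the only writer, a restored snapshot and the live state cannot drift apart, which is the property the quality criteria above ask for.
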
### Phase B. Prepared Execution Unification
- Reference: `src/bridge/initReplBridge.ts`
- AX apply location: `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs`
- Completion criteria:
  - prompt stack assembly, execution mode choice, and final assistant commit are engine-owned.
  - send/regenerate/retry/queued follow-up/slash flows all call the same preparation API.
- Quality criteria:
  - behavior is deterministic per tab/settings combination.
  - UI stops building different prompt stacks for the same conversation state.

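A minimal sketch of the "one preparation API" idea follows. It assumes a hypothetical `prepareExecution` function whose result depends only on conversation history, input, and settings; the entry point is accepted but deliberately ignored when choosing the route. None of these names come from `initReplBridge.ts` or `AxAgentExecutionEngine.cs`, and the system-prompt string is a placeholder.

```typescript
// Hypothetical types for illustration only.
type EntryPoint = "send" | "regenerate" | "retry" | "queued" | "slash";
type Tab = "Chat" | "Cowork" | "Code";

interface Settings {
  tab: Tab;
}

interface PreparedExecution {
  promptStack: string[];
  mode: "chat" | "agent";
}

// The engine owns prompt assembly and mode choice. The entry point may be logged
// by callers, but it never influences the prepared result.
function prepareExecution(
  _entry: EntryPoint,
  history: string[],
  input: string,
  settings: Settings,
): PreparedExecution {
  const systemPrompt = `system:${settings.tab}`; // placeholder system prompt
  const mode = settings.tab === "Chat" ? "chat" : "agent";
  return { promptStack: [systemPrompt, ...history, input], mode };
}
```

With this shape, the determinism criterion becomes directly testable: calling the function with the same history, input, and settings from any two entry points must produce equal results.
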
### Phase C. AgentLoop Event Normalization
- Reference: `src/bridge/sessionRunner.ts`, `src/bridge/bridgeMessaging.ts`
- AX apply location: `src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs`
- Completion criteria:
  - loop events are normalized into bounded activity/event records before UI consumption.
  - permission requests, failure states, retries, and completion states use a stable event shape.
- Quality criteria:
  - Cowork/Code no longer flash rapidly during long-running tool sequences.
  - file path/debug detail remains collapsed by default.

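The normalization step described above might look roughly like the sketch below. The raw event type strings (`tool_start`, `permission_request`, and so on), the `ActivityRecord` shape, and the 80-character summary limit are all assumptions made for illustration; the real event names in `sessionRunner.ts` and `AgentLoopService.cs` may differ.

```typescript
// Hypothetical raw and normalized event shapes.
type Kind = "tool" | "permission" | "error" | "retry" | "complete";

interface RawEvent {
  type: string;
  tool?: string;
  detail?: string;
}

interface ActivityRecord {
  kind: Kind;
  summary: string;
  detail: string;
  detailCollapsed: boolean;
}

const MAX_SUMMARY = 80; // assumed bound on visible summary length

const KIND_BY_TYPE = new Map<string, Kind>([
  ["tool_start", "tool"],
  ["tool_progress", "tool"],
  ["tool_result", "tool"],
  ["permission_request", "permission"],
  ["tool_error", "error"],
  ["retry", "retry"],
  ["complete", "complete"],
]);

// Map a raw loop event to one stable shape; unknown events never reach the UI.
function normalize(raw: RawEvent): ActivityRecord | null {
  const kind = KIND_BY_TYPE.get(raw.type);
  if (kind === undefined) return null;
  const label = raw.tool ? `${kind}: ${raw.tool}` : kind;
  return {
    kind,
    summary: label.slice(0, MAX_SUMMARY),
    detail: raw.detail ?? "",
    detailCollapsed: true, // file paths and debug payloads stay collapsed by default
  };
}

// Append to a bounded log; a repeat of the latest record is collapsed to avoid flashing.
// Assumes maxRecords >= 1.
function appendBounded(log: ActivityRecord[], rec: ActivityRecord, maxRecords: number): ActivityRecord[] {
  const last = log[log.length - 1];
  if (last !== undefined && last.kind === rec.kind && last.summary === rec.summary) {
    return log;
  }
  return [...log, rec].slice(-maxRecords);
}
```

Collapsing consecutive identical records is one simple way to address the "flash rapidly" symptom; a production loop would likely also carry progress counters on the surviving record.
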
### Phase D. Timeline Render Parity
- Reference: `src/screens/REPL.tsx`, `src/components/Messages.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - assistant/user messages, execution logs, compact boundaries, and queue summaries are rendered from one derived timeline model.
  - direct imperative bubble injection is removed from normal send/regenerate/retry flows.
- Quality criteria:
  - no blank assistant cards.
  - no token-only completion without visible content.
  - no duplicate event banners after re-render.

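As a rough illustration of "one derived timeline model", the sketch below computes the visible timeline as a pure function of stored messages and events. The `seq` ordering field and all type names are hypothetical and exist only to keep the example self-contained; they are not from `Messages.tsx` or `ChatWindow.xaml.cs`.

```typescript
// Hypothetical stored records; `seq` is an assumed monotonic ordering field.
interface StoredMessage {
  id: string;
  seq: number;
  role: "user" | "assistant";
  text: string;
}

interface StoredEvent {
  id: string;
  seq: number;
  summary: string;
}

type TimelineItem =
  | { key: string; seq: number; type: "message"; role: "user" | "assistant"; text: string }
  | { key: string; seq: number; type: "banner"; summary: string };

// Pure derivation: the same stored state always yields the same timeline,
// so a re-render cannot introduce duplicates.
function deriveTimeline(messages: StoredMessage[], events: StoredEvent[]): TimelineItem[] {
  const items: TimelineItem[] = [];
  const seen = new Set<string>();
  for (const m of messages) {
    const key = `msg:${m.id}`;
    if (seen.has(key)) continue;
    if (m.role === "assistant" && m.text.trim() === "") continue; // no blank assistant cards
    seen.add(key);
    items.push({ key, seq: m.seq, type: "message", role: m.role, text: m.text });
  }
  for (const e of events) {
    const key = `evt:${e.id}`;
    if (seen.has(key)) continue; // replayed events do not duplicate banners
    seen.add(key);
    items.push({ key, seq: e.seq, type: "banner", summary: e.summary });
  }
  return items.sort((a, b) => a.seq - b.seq);
}
```

Nothing in this model is injected imperatively: a bubble appears only because a stored message or event exists for it.
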
### Phase E. Composer and Status Strip Simplification
- Reference: `src/screens/REPL.tsx`, `src/components/StatusLine.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - composer height grows only on explicit line breaks.
  - status strip, queue summary, and runtime activity all use debounced runtime updates.
  - Chat/Cowork/Code share one responsive width calculation policy.
- Quality criteria:
  - resizing feels natural.
  - composer does not keep growing after send.
  - metadata remains subordinate to the message timeline.

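Two of the Phase E rules are easy to express as small helpers. The sketch below assumes a row-based composer with a hypothetical cap of 8 rows, and a `StatusCoalescer` that keeps only the latest pending status until something (for example a timer in the real UI) calls `flush`. Both names and the row limits are illustrative, not taken from the AX or `claw-code` sources.

```typescript
// Rows depend only on explicit line breaks, never on soft wrapping of long text.
function composerRows(text: string, minRows: number = 1, maxRows: number = 8): number {
  const explicitLines = text.split("\n").length;
  return Math.min(maxRows, Math.max(minRows, explicitLines));
}

// Keeps only the most recent status; the render callback runs once per flush.
// A real implementation would call flush() from a debounce timer.
class StatusCoalescer {
  private pending: string | null = null;
  private readonly render: (status: string) => void;

  constructor(render: (status: string) => void) {
    this.render = render;
  }

  update(status: string): void {
    this.pending = status;
  }

  flush(): void {
    if (this.pending === null) return;
    this.render(this.pending);
    this.pending = null;
  }
}
```

After a send, the composer text is empty, so `composerRows("")` returns the minimum row count and the composer cannot keep growing.
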
### Phase F. Recovery, Resume, and Verification
- Reference: `src/bootstrap/state.ts`, `src/bridge/sessionRunner.ts`, `src/screens/REPL.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs`
- Completion criteria:
  - reopen after interruption keeps queue, runtime summary, and latest visible assistant state consistent.
  - retry-last and regenerate do not depend on mutating `InputBox.Text`.
  - all three tabs pass reopen/retry/manual compact/manual stop/manual resume scenarios.
- Quality criteria:
  - stored conversation and rendered conversation stay identical after restore.
  - final reopened state matches the last completed runtime state.

## Execution Tracks
1. Hook contract parity
   - Structured hook output support (`updatedInput`, `updatedPermissions`, `additionalContext`).
   - Runtime gating through settings toggles.

2. Session/state parity
   - Deterministic run resume rules.
   - Stable jsonl event schema + replay compatibility.

3. Recovery parity
   - Failure-type classification and standardized retry guidance.
   - Reduced repeated wrong-tool loops.

4. Completion parity
   - Evidence-based finalization criteria for code/document tasks.

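For the hook contract track, a sketch of how structured hook output could be applied to a pending tool call is shown below. Only the three field names (`updatedInput`, `updatedPermissions`, `additionalContext`) come from the list above; their exact types and merge semantics, the `PendingToolCall` shape, and the `hooksEnabled` toggle are assumptions made for this example.

```typescript
// Assumed permission decision values; the real contract may use different ones.
type Decision = "allow" | "deny" | "ask";

interface HookOutput {
  updatedInput?: string;
  updatedPermissions?: Record<string, Decision>;
  additionalContext?: string;
}

// Hypothetical shape of a tool call waiting to be executed.
interface PendingToolCall {
  input: string;
  permissions: Record<string, Decision>;
  context: string[];
}

// Apply hook output without mutating the original call.
// When the settings toggle is off, hook output is ignored entirely.
function applyHookOutput(call: PendingToolCall, out: HookOutput, hooksEnabled: boolean): PendingToolCall {
  if (!hooksEnabled) return call;
  return {
    input: out.updatedInput ?? call.input,
    permissions: { ...call.permissions, ...(out.updatedPermissions ?? {}) },
    context: out.additionalContext ? [...call.context, out.additionalContext] : call.context,
  };
}
```

Returning a new object instead of mutating keeps the pre-hook call available for logging and replay.
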
## Done Criteria
- Internal parity scenarios pass target threshold.
- Resume/replay failures: zero.
- `dotnet build` warnings/errors: zero.

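The "resume/replay failures: zero" criterion depends on the jsonl event log being replayable even when a run was interrupted mid-write. The sketch below shows one way to make replay tolerant: each line carries a schema version and a sequence number, and malformed, unknown-version, or out-of-order lines are counted and skipped instead of aborting the replay. The field names (`v`, `seq`, `type`, `payload`) and the version constant are hypothetical, not the actual AX schema.

```typescript
const SCHEMA_VERSION = 1; // assumed current schema version

// Hypothetical one-line-per-event record.
interface SessionEvent {
  v: number;
  seq: number;
  type: string;
  payload: unknown;
}

function toJsonl(events: SessionEvent[]): string {
  return events.map((e) => JSON.stringify(e)).join("\n");
}

function isSessionEvent(x: unknown): x is SessionEvent {
  if (typeof x !== "object" || x === null) return false;
  const r = x as Record<string, unknown>;
  return typeof r["v"] === "number" && typeof r["seq"] === "number" && typeof r["type"] === "string";
}

// Replay keeps valid, strictly increasing events and counts everything else as skipped.
function replay(jsonl: string): { events: SessionEvent[]; skipped: number } {
  const events: SessionEvent[] = [];
  let skipped = 0;
  let lastSeq = -Infinity;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      skipped++; // truncated or corrupted line, e.g. from an interrupted write
      continue;
    }
    if (!isSessionEvent(parsed) || parsed.v !== SCHEMA_VERSION || parsed.seq <= lastSeq) {
      skipped++;
      continue;
    }
    lastSeq = parsed.seq;
    events.push(parsed);
  }
  return { events, skipped };
}
```

A non-zero `skipped` count is a useful signal for the parity scenarios: it distinguishes "resumed cleanly" from "resumed after dropping damaged records".
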
## Validation Matrix
- Build: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify\\ -p:IntermediateOutputPath=obj\\verify\\`
- Manual scenario 1: Chat send -> answer visible -> retry -> regenerate -> reopen conversation
- Manual scenario 2: Cowork tool run -> progress summary -> completion -> queue next request -> reopen
- Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
- Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression

## Canonical Prompt Set
- Updated: 2026-04-05 22:04 (KST)
- The following prompt set should be used for AX vs `claw-code` parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.

1. Chat basic answer
   - Prompt: `회의 일정 조정 메일을 정중한 한국어로 써줘` (English: "Write an email in polite Korean to reschedule a meeting")
   - Apply to: `Chat`
   - Verify: normal reply render, retry/regenerate stability, reopen durability

2. Chat long-form explanation
   - Prompt: `RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘` (English: "Explain seven differences between RAG and fine-tuning from a practical perspective")
   - Apply to: `Chat`
   - Verify: long response rendering, compaction follow-up continuity

3. Cowork document task
   - Prompt: `신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함` (English: "Draft a proposal for adopting a new ERP system. Include purpose, scope, expected benefits, and rollout schedule")
   - Apply to: `Cowork`
   - Verify: topic/task preset routing, plan-first execution, actual document-oriented output path

4. Cowork data task
   - Prompt: `매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘` (English: "Analyze the sales CSV and summarize monthly trends and outliers")
   - Apply to: `Cowork`
   - Verify: data-analysis tool choice, reduced runtime noise, final summary quality

5. Code bug-fix task
   - Prompt: `현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘` (English: "Find the cause of the settings-save bug in the current project and fix it")
   - Apply to: `Code`
   - Verify: read/search/edit path, diff persistence, reopen consistency

6. Code build/test task
   - Prompt: `빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘` (English: "Reproduce the build error, fix it, and build again")
   - Apply to: `Code`
   - Verify: build/test loop, failure retry, final completion message

7. Queued follow-up
   - Prompt sequence:
     - `이 창 레이아웃 문제 원인 찾아줘` (English: "Find the cause of this window's layout problem")
     - `끝나면 README도 같이 갱신해줘` (English: "When you are done, update the README as well")
   - Apply to: `Cowork`, `Code`
   - Verify: queue chaining, next-turn pickup without UI mutation

8. Post-compaction continuity
   - Prompt: `지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘` (English: "Continue by summarizing what we have discussed so far in five lines and suggest the next task")
   - Apply to: `Chat`, `Cowork`, `Code`
   - Verify: compact-after-next-turn continuity, no token-only completion

9. Permission approval
   - Prompt: `이 파일을 수정해서 저장해줘` (English: "Edit this file and save it")
   - Apply to: `Code`
   - Verify: permission request, approve/reject rendering, final transcript consistency

10. Slash / skill entry
    - Prompt: `/bug-hunt src 폴더 잠재 버그 찾아줘` (English: "/bug-hunt find potential bugs in the src folder")
    - Apply to: `Code`
    - Verify: slash entry uses the same prepared-execution route as normal send

## Tool / Skill Delta Snapshot
- Updated: 2026-04-05 22:04 (KST)
- AX tool registry count is larger than `claw-code`, but the shape is different.
- AX reference: `src/AxCopilot/Services/Agent/ToolRegistry.cs`
- `claw-code` reference: `src/tools/*`, `src/skills/bundledSkills.ts`

### AX stronger areas
- Document/office generation and conversion (`ExcelSkill`, `DocxSkill`, `PptxSkill`, `DocumentPlannerTool`, `DocumentAssemblerTool`)
- Data/business utilities (`DataPivotTool`, `SqlTool`, `FormatConvertTool`, `TextSummarizeTool`)
- WPF-integrated enterprise UX and Korean workflow presets

### claw-code stronger areas
- Transcript-native tool use / rejection / approval message taxonomy
- Plan approval request/response rendering in the message stream
- Permission and tool-result message consistency
- Bundled skill registry and skill message integration

### Remaining parity target
- Keep AX's richer business/document tool set
- Bring transcript rendering and approval/status UX closer to `claw-code`

## Transcript-First Approval / Ask UX
- Updated: 2026-04-05 18:58 (KST)
- `plan approval` and `user ask` should both resolve inside the transcript first.
- Secondary windows are allowed only as detail surfaces, not as the primary decision flow.
- AX implementation status:
  - `plan approval`: transcript-first, detail view via `PlanViewerWindow`
  - `user ask`: transcript-first inline question card with choices / direct input / submit

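The inline question card can be modeled as a small state machine that lives in the transcript instead of in a popup. The sketch below is an assumption-laden illustration: `AskCard` and its helper functions are hypothetical names, and the rule that an empty answer cannot be submitted is a design choice of this example, not something the source specifies.

```typescript
type AskStatus = "pending" | "answered" | "cancelled";

// Hypothetical transcript item for a `user ask` request.
interface AskCard {
  question: string;
  choices: string[];
  draft: string; // current pill selection or typed text
  status: AskStatus;
  answer: string | null;
}

function createAsk(question: string, choices: string[]): AskCard {
  return { question, choices, draft: "", status: "pending", answer: null };
}

// Selecting a choice pill fills the draft; out-of-range indexes are ignored.
function pickChoice(card: AskCard, index: number): AskCard {
  if (card.status !== "pending") return card;
  const choice = card.choices[index];
  return choice === undefined ? card : { ...card, draft: choice };
}

// Direct input replaces the draft.
function typeAnswer(card: AskCard, text: string): AskCard {
  return card.status === "pending" ? { ...card, draft: text } : card;
}

// Submit resolves the card once; an empty draft keeps it pending.
function submit(card: AskCard): AskCard {
  if (card.status !== "pending" || card.draft.trim() === "") return card;
  return { ...card, status: "answered", answer: card.draft.trim() };
}

function cancel(card: AskCard): AskCard {
  return card.status === "pending" ? { ...card, status: "cancelled" } : card;
}
```

Because a resolved card ignores further input, the stored transcript and the rendered card stay consistent after reopen: the card shows exactly the answer that was delivered.
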
## Current Snapshot
- Updated: 2026-04-05 19:42 (KST)
- Estimated parity:
  - Core engine: `89%`
  - Main transcript UI: `96%`
  - Cowork/Code runtime UX: `92%`
  - Internal settings linkage: `88%`
  - Overall AX Agent parity: `93%`

## Remaining Gaps
1. Prompt lifecycle parity
   - `claw-code` reference: `src/utils/handlePromptSubmit.ts`, `src/utils/processUserInput/processTextPrompt.ts`
   - AX gap:
     - `send / retry / regenerate` are mostly unified, but `slash / next turn after compact / some queue post-processing` still have spots in `ChatWindow.xaml.cs` that touch UI state first.
     - The goal is for every input entry point to go through only the same prepare/execute/finalize axis of `AxAgentExecutionEngine`.

2. Plan / approval rendering parity
   - `claw-code` reference: `src/components/messages/PlanApprovalMessage.tsx`
   - AX gap:
     - The default transcript has been reduced to mostly compact pills, but approval/plan result rendering is still mixed with `Popup/Window + WPF cards`.
     - The goal is to converge on a more unified timeline language under a "transcript first, open on demand" principle.

3. Status line / composer parity
   - `claw-code` reference: `src/components/StatusLine.tsx`, `src/components/PromptInput/PromptInput.tsx`
   - AX gap:
     - The bottom status bar and composer options have been trimmed considerably, but status metadata is still scattered and some toggles/quick settings remain as separate rows.
     - The goal is to compress these further into a single action bar below the transcript.

4. Runtime event density parity
   - `claw-code` reference: `src/bridge/sessionRunner.ts`, `src/components/StatusNotices.tsx`
   - AX gap:
     - Default non-debug logs have been reduced, but some Cowork/Code events still disturb the timeline frequently.
     - The goal is to normalize `permission / tool / error / complete / paused / resumed` into a more stable event shape.

## Settings Review
- Remove candidate:
  - `PlanMode`
    - current state: the user-facing UI and save path have been pinned to `off`, but type remnants remain in `AppSettings`, `SettingsViewModel`, and `AppStateService`
    - rationale: the current policy is pinned to `off`, so the user-selected value does not meaningfully contribute to the engine
  - `Code.EnablePlanModeTools`
    - current state: the UI/save path and default value have been pinned to `false`, but compatibility remnants remain in the model/settings types
    - rationale: it no longer changes the actual execution route under the current engine policy
- Move to developer-only candidate:
  - `FreeTierDelaySeconds`
    - rationale: ordinary users have little reason to adjust it, and it directly affects the engine's delay policy
  - `MaxAgentIterations`
  - `MaxRetryOnError`
    - rationale: runtime tuning values that directly affect the quality of the core execution loop
- Keep as runtime-critical:
  - `OperationMode`
  - `MaxContextTokens`
  - `ContextCompactTriggerPercent`
  - `EnableProactiveContextCompact`
  - `EnableCoworkVerification`
  - `EnableCodeVerification`
  - `Code.EnableWorktreeTools / EnableTeamTools / EnableCronTools`

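The proposed classification can be captured as data so that the settings UI filters by surface instead of hard-coding visibility per control. The sketch below mirrors the candidate lists above; the `Surface` type and `visibleSettings` helper are hypothetical, and the three `Code.Enable...Tools` keys are an assumed expansion of the slash shorthand used in the list.

```typescript
type Surface = "user" | "developer" | "removed";

// Mirrors the candidate classification in this section; still a proposal, not shipped policy.
const SETTING_SURFACE: Record<string, Surface> = {
  PlanMode: "removed",
  "Code.EnablePlanModeTools": "removed",
  FreeTierDelaySeconds: "developer",
  MaxAgentIterations: "developer",
  MaxRetryOnError: "developer",
  OperationMode: "user",
  MaxContextTokens: "user",
  ContextCompactTriggerPercent: "user",
  EnableProactiveContextCompact: "user",
  EnableCoworkVerification: "user",
  EnableCodeVerification: "user",
  "Code.EnableWorktreeTools": "user", // assumed full key name
  "Code.EnableTeamTools": "user", // assumed full key name
  "Code.EnableCronTools": "user", // assumed full key name
};

// Settings shown in the UI for the given mode; removed settings are never shown.
function visibleSettings(developerMode: boolean): string[] {
  return Object.entries(SETTING_SURFACE)
    .filter(([, surface]) => surface === "user" || (developerMode && surface === "developer"))
    .map(([name]) => name);
}
```

Keeping the classification in one table also makes it easy to audit which engine-affecting settings are still exposed to ordinary users.
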
## Known UX / Performance Risks
- Topic preset hover flicker was caused by duplicate hover systems:
  - custom hover label
  - default WPF `ToolTip`
- AX fix:
  - remove default `ToolTip` from topic cards and keep a single hover label path
- Remaining runtime performance review targets:
  - `RefreshContextUsageVisual()` frequency
  - `BuildTopicButtons()` rebuild frequency
  - `OnAgentEvent` timeline churn during long Cowork/Code runs
  - compact queue summary still needs one more pass to fully match `claw-code` footer minimalism

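For the call-frequency review targets above, one common mitigation is to throttle the refresh call. The sketch below is a generic leading-edge throttle with an injected clock so it can be tested deterministically; `makeThrottled` is a hypothetical helper, the 100 ms interval is arbitrary, and calls that arrive inside the interval are simply dropped (a real UI would usually also schedule one trailing refresh).

```typescript
// Wraps fn so it runs at most once per minIntervalMs.
// Returns true when the call actually ran, false when it was skipped.
function makeThrottled(fn: () => void, minIntervalMs: number, now: () => number): () => boolean {
  let lastRun: number | null = null;
  return (): boolean => {
    const t = now();
    if (lastRun !== null && t - lastRun < minIntervalMs) {
      return false; // too soon after the previous run
    }
    lastRun = t;
    fn();
    return true;
  };
}
```

In production the clock argument would typically be `() => Date.now()`; passing it in keeps the helper free of hidden time dependencies.
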