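- Re-compared the claw-code source structure against the AX Agent structure and drew up an additional quality uplift plan
- Documented plans for transcript renderer separation, permission presentation catalog, tool result taxonomy, finishing plan approval inline, layering the runtime summary, and fixing the regression prompt ritual
- Separated runtime-critical settings from settings that are candidates to move to developer-only
- Reflected the history as of 2026-04-06 00:22 (KST) in the README and DEVELOPMENT documents
- Verified dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\ with 0 warnings and 0 errors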
Claw Code Parity Plan (Rewritten)
Scope
- Align AX Copilot with claw-code quality for loop reliability, permission/hook behavior, and session durability.
Update
- Updated: 2026-04-05 15:34 (KST)
- Rebased the AX Agent improvement plan on actual `claw-code` runtime files instead of earlier AX snapshots. The reference spine is now `src/bootstrap/state.ts -> src/bridge/initReplBridge.ts -> src/bridge/sessionRunner.ts -> src/screens/REPL.tsx -> src/components/Messages.tsx -> src/components/StatusLine.tsx`.
- AX Agent work should follow that same quality order: state first, execution second, render last. UI-only fixes that bypass state/execution should be treated as temporary.
- Updated: 2026-04-05 16:55 (KST)
- Current estimated parity vs `claw-code`: core execution engine `82%`, main chat UI `68%`, Cowork/Code status UX `63%`, internal settings linkage `88%`, overall AX Agent `74%`.
- Engine-affecting settings should be handled conservatively during parity work. If a setting changes the main execution route, approval flow, or recovery behavior without representing a stable real-world user choice, it should be moved to developer-only UI or removed from user-facing surfaces.
Preserved History (Summary)
- Core loop guards and post-tool verification gates are already partially implemented.
- Plan Mode, parallel tool execution, and unknown-tool recovery are in place.
- Session restore hardening is ongoing.
Reference Map
| claw-code reference | AX apply target | completion criteria | quality criteria |
|---|---|---|---|
| `src/bootstrap/state.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs` | one canonical runtime/session state for current turn, queue, retry, execution events, and persisted snapshot | reopen/retry/queue flows do not create duplicate or blank assistant messages |
| `src/bridge/initReplBridge.ts` | `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs` | send/regenerate/retry/queued follow-up/slash all enter through one prepared-execution path | same input under same settings takes same execution route regardless of entry point |
| `src/bridge/sessionRunner.ts` | `src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs` | tool start/result/error/progress normalized once inside loop layer | Cowork/Code no longer flash repeated status strings or overshare debug payloads |
| `src/bridge/bridgeMessaging.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AgentLoopService.cs` | inbound execution events separated from display-only events before UI render | execution event replay does not duplicate visible timeline banners |
| `src/screens/REPL.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | screen state transitions, queue flow, retry flow, and composer state use shared runtime helpers | window resize, queue chaining, and retry feel stable instead of UI-patched |
| `src/components/Messages.tsx` | `src/AxCopilot/Views/ChatWindow.xaml.cs` | timeline derives from normalized conversation/session state only | no token-only completions, blank cards, or direct injected duplicates |
| `src/components/StatusLine.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | status strip computed from debounced runtime state, not multiple imperative refresh calls | metadata stays lightweight and does not overpower message timeline |
AX Agent Improvement Phases
Phase A. Runtime State Canonicalization
- Reference: `src/bootstrap/state.ts`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs`
- Completion criteria:
  - `Chat`, `Cowork`, `Code` all update one shared runtime/session state model.
  - queue, retry, post-compaction, and execution-event state can be restored after reopen.
- Quality criteria:
  - reopening a conversation reproduces the same visible timeline without extra assistant cards.
  - queue and execution badges remain in sync with the stored conversation.
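As a rough illustration of the Phase A completion criteria, the sketch below keeps one session state object that all three tabs update and that round-trips through a JSON snapshot. Every type and function name here (`RuntimeSessionState`, `enqueuePrompt`, `toSnapshot`, and so on) is hypothetical and is not taken from the AX or `claw-code` sources.

```typescript
// Hypothetical shapes for illustration only; not the actual AX state model.
type TabKind = "Chat" | "Cowork" | "Code";

interface RuntimeSessionState {
  tab: TabKind;
  currentTurnId: string | null;
  queue: string[];           // queued follow-up prompts
  retryCount: number;
  postCompaction: boolean;
  executionEvents: string[]; // normalized event summaries
}

function createSessionState(tab: TabKind): RuntimeSessionState {
  return {
    tab,
    currentTurnId: null,
    queue: [],
    retryCount: 0,
    postCompaction: false,
    executionEvents: [],
  };
}

// Each update returns a new object, so every tab goes through the same path
// and earlier states are never mutated behind the UI's back.
function enqueuePrompt(state: RuntimeSessionState, prompt: string): RuntimeSessionState {
  return { ...state, queue: [...state.queue, prompt] };
}

function recordEvent(state: RuntimeSessionState, summary: string): RuntimeSessionState {
  return { ...state, executionEvents: [...state.executionEvents, summary] };
}

// The persisted snapshot is plain JSON, so reopening restores what was stored.
function toSnapshot(state: RuntimeSessionState): string {
  return JSON.stringify(state);
}

function fromSnapshot(snapshot: string): RuntimeSessionState {
  return JSON.parse(snapshot) as RuntimeSessionState;
}
```

Because the snapshot is the state itself, the "stored conversation equals rendered conversation" check reduces to comparing two serialized objects.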
Phase B. Prepared Execution Unification
- Reference: `src/bridge/initReplBridge.ts`
- AX apply location: `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs`
- Completion criteria:
  - prompt stack assembly, execution mode choice, and final assistant commit are engine-owned.
  - send/regenerate/retry/queued follow-up/slash flows all call the same preparation API.
- Quality criteria:
  - behavior is deterministic per tab/settings combination.
  - UI stops building different prompt stacks for the same conversation state.
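The Phase B idea can be sketched as a single preparation function that every entry point calls. The names below (`prepareExecution`, `EntryPoint`, `ExecutionSettings`) and the route labels are assumptions for illustration; the real engine API is not shown in this plan.

```typescript
// Hypothetical names; a sketch of "one prepared-execution path", not the AX API.
type EntryPoint = "send" | "regenerate" | "retry" | "queued" | "slash";

interface ExecutionSettings {
  tab: "Chat" | "Cowork" | "Code";
  operationMode: string;
}

interface PreparedExecution {
  promptStack: string[];
  route: string;
}

// The entry point is accepted (for example for logging) but deliberately does
// not influence the prompt stack or the route, which keeps behavior
// deterministic per tab/settings combination.
function prepareExecution(
  entry: EntryPoint,
  conversation: string[],
  input: string,
  settings: ExecutionSettings,
): PreparedExecution {
  void entry;
  // Assumed route labels: Chat answers directly, Cowork/Code use the agent loop.
  const route = settings.tab === "Chat" ? "direct-answer" : "agent-loop";
  return {
    promptStack: [`mode:${settings.operationMode}`, ...conversation, input],
    route,
  };
}
```

With this shape, a parity check can call the function once per entry point and compare the results structurally.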
Phase C. AgentLoop Event Normalization
- Reference: `src/bridge/sessionRunner.ts`, `src/bridge/bridgeMessaging.ts`
- AX apply location: `src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs`
- Completion criteria:
  - loop events are normalized into bounded activity/event records before UI consumption.
  - permission requests, failure states, retries, and completion states use a stable event shape.
- Quality criteria:
  - Cowork/Code no longer flash rapidly during long-running tool sequences.
  - file path/debug detail remains collapsed by default.
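One possible reading of "bounded activity/event records" is sketched below: identical consecutive status events are collapsed into one record with a repeat counter, detail is collapsed by default, and the list is capped. The event kinds, the record shape, and the cap of 50 are all assumptions, not values from the AX loop.

```typescript
// Hypothetical event shape and bound, for illustration only.
type LoopEventKind =
  | "tool_start" | "tool_result" | "error" | "progress" | "permission" | "complete";

interface RawLoopEvent {
  kind: LoopEventKind;
  message: string;
  detail?: string; // file paths or debug payload
}

interface ActivityRecord {
  kind: LoopEventKind;
  summary: string;
  detail: string | null;
  detailCollapsed: boolean;
  repeat: number;
}

const MAX_ACTIVITY_RECORDS = 50; // assumed bound

function normalizeLoopEvents(events: RawLoopEvent[]): ActivityRecord[] {
  const records: ActivityRecord[] = [];
  for (const ev of events) {
    const last = records[records.length - 1];
    if (last !== undefined && last.kind === ev.kind && last.summary === ev.message) {
      last.repeat += 1; // collapse identical consecutive status strings
      continue;
    }
    records.push({
      kind: ev.kind,
      summary: ev.message,
      detail: ev.detail ?? null,
      detailCollapsed: true, // detail stays collapsed until the user opens it
      repeat: 1,
    });
  }
  // Keep only the most recent records so long runs stay bounded.
  return records.slice(-MAX_ACTIVITY_RECORDS);
}
```

Collapsing repeats at this layer means the UI receives one record to update rather than a stream of identical strings to re-render.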
Phase D. Timeline Render Parity
- Reference: `src/screens/REPL.tsx`, `src/components/Messages.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - assistant/user messages, execution logs, compact boundaries, and queue summaries are rendered from one derived timeline model.
  - direct imperative bubble injection is removed from normal send/regenerate/retry flows.
- Quality criteria:
  - no blank assistant cards.
  - no token-only completion without visible content.
  - no duplicate event banners after re-render.
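A derived timeline can be sketched as a pure function over stored messages, which makes the three quality criteria above checkable in isolation. The message shape and the rule of de-duplicating by `id` are assumptions made for this example.

```typescript
// Hypothetical stored-message shape; the real conversation model may differ.
interface StoredMessage {
  id: string;
  role: "user" | "assistant" | "event";
  text: string;
}

interface TimelineItem {
  id: string;
  role: StoredMessage["role"];
  text: string;
}

// Pure derivation: the same stored messages always yield the same timeline,
// so a re-render or reopen cannot add cards that are not in the store.
function deriveTimeline(messages: StoredMessage[]): TimelineItem[] {
  const seenIds = new Set<string>();
  const items: TimelineItem[] = [];
  for (const m of messages) {
    if (seenIds.has(m.id)) continue; // replayed entries do not duplicate banners
    seenIds.add(m.id);
    const text = m.text.trim();
    if (m.role === "assistant" && text.length === 0) continue; // no blank cards
    items.push({ id: m.id, role: m.role, text });
  }
  return items;
}
```

Since nothing is injected imperatively, calling the function twice on the same input is a cheap regression check for duplicate or blank cards.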
Phase E. Composer and Status Strip Simplification
- Reference: `src/screens/REPL.tsx`, `src/components/StatusLine.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - composer height grows only on explicit line breaks.
  - status strip, queue summary, and runtime activity all use debounced runtime updates.
  - Chat/Cowork/Code share one responsive width calculation policy.
- Quality criteria:
  - resizing feels natural.
  - composer does not keep growing after send.
  - metadata remains subordinate to the message timeline.
Phase F. Recovery, Resume, and Verification
- Reference: `src/bootstrap/state.ts`, `src/bridge/sessionRunner.ts`, `src/screens/REPL.tsx`
- AX apply location: `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs`
- Completion criteria:
  - reopen after interruption keeps queue, runtime summary, and latest visible assistant state consistent.
  - retry-last and regenerate do not depend on mutating `InputBox.Text`.
  - all three tabs pass reopen/retry/manual compact/manual stop/manual resume scenarios.
- Quality criteria:
  - stored conversation and rendered conversation stay identical after restore.
  - final reopened state matches the last completed runtime state.
Execution Tracks
- Hook contract parity
  - Structured hook output support (`updatedInput`, `updatedPermissions`, `additionalContext`).
  - Runtime gating through settings toggles.
- Session/state parity
  - Deterministic run resume rules.
  - Stable jsonl event schema + replay compatibility.
- Recovery parity
  - Failure-type classification and standardized retry guidance.
  - Reduced repeated wrong-tool loops.
- Completion parity
  - Evidence-based finalization criteria for code/document tasks.
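For the hook contract track, a minimal sketch of applying structured hook output might look like the following. Only the three field names (`updatedInput`, `updatedPermissions`, `additionalContext`) come from the list above; their types, the merge semantics, and the `hooksEnabled` toggle are assumptions.

```typescript
// Field names are from the plan; types and merge rules are assumed.
interface HookOutput {
  updatedInput?: string;
  updatedPermissions?: string[];
  additionalContext?: string;
}

// Hypothetical view of a pending tool call.
interface ToolCallContext {
  input: string;
  grantedPermissions: string[];
  context: string[];
}

function applyHookOutput(
  call: ToolCallContext,
  hook: HookOutput,
  hooksEnabled: boolean,
): ToolCallContext {
  // Runtime gating: with the settings toggle off, hook output is ignored.
  if (!hooksEnabled) return call;
  return {
    input: hook.updatedInput ?? call.input,
    grantedPermissions: hook.updatedPermissions ?? call.grantedPermissions,
    context:
      hook.additionalContext !== undefined
        ? [...call.context, hook.additionalContext]
        : call.context,
  };
}
```

Treating absent fields as "no change" keeps a hook that only adds context from accidentally clearing input or permissions.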
Done Criteria
- Internal parity scenarios pass target threshold.
- Resume/replay failures: zero.
- `dotnet build` warnings/errors: zero.
Validation Matrix
- Build: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify\\ -p:IntermediateOutputPath=obj\\verify\\`
- Manual scenario 1: Chat send -> answer visible -> retry -> regenerate -> reopen conversation
- Manual scenario 2: Cowork tool run -> progress summary -> completion -> queue next request -> reopen
- Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
- Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression
Canonical Prompt Set
- Updated: 2026-04-05 22:04 (KST)
- The following prompt set should be used for AX vs `claw-code` parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.
- Operational checklist copy: `docs/AX_AGENT_REGRESSION_PROMPTS.md`
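- Chat basic answer
  - Prompt: `회의 일정 조정 메일을 정중한 한국어로 써줘` (English: "Write a polite Korean email to reschedule a meeting")
  - Apply to: `Chat`
  - Verify: normal reply render, retry/regenerate stability, reopen durability
- Chat long-form explanation
  - Prompt: `RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘` (English: "Explain seven differences between RAG and fine-tuning from a practical perspective")
  - Apply to: `Chat`
  - Verify: long response rendering, compaction follow-up continuity
- Cowork document task
  - Prompt: `신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함` (English: "Draft a proposal for adopting a new ERP system. Include purpose, scope, expected benefits, and rollout schedule")
  - Apply to: `Cowork`
  - Verify: topic/task preset routing, plan-first execution, actual document-oriented output path
- Cowork data task
  - Prompt: `매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘` (English: "Analyze the sales CSV and summarize monthly trends and outliers")
  - Apply to: `Cowork`
  - Verify: data-analysis tool choice, reduced runtime noise, final summary quality
- Code bug-fix task
  - Prompt: `현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘` (English: "Find the cause of the settings-save bug in the current project and fix it")
  - Apply to: `Code`
  - Verify: read/search/edit path, diff persistence, reopen consistency
- Code build/test task
  - Prompt: `빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘` (English: "Reproduce the build error, fix it, and build again")
  - Apply to: `Code`
  - Verify: build/test loop, failure retry, final completion message
- Queued follow-up
  - Prompt sequence:
    - `이 창 레이아웃 문제 원인 찾아줘` (English: "Find the cause of this window's layout problem")
    - `끝나면 README도 같이 갱신해줘` (English: "When done, also update the README")
  - Apply to: `Cowork`, `Code`
  - Verify: queue chaining, next-turn pickup without UI mutation
- Post-compaction continuity
  - Prompt: `지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘` (English: "Summarize what we have discussed so far in five lines and suggest the next task")
  - Apply to: `Chat`, `Cowork`, `Code`
  - Verify: compact-after-next-turn continuity, no token-only completion
- Permission approval
  - Prompt: `이 파일을 수정해서 저장해줘` (English: "Edit this file and save it")
  - Apply to: `Code`
  - Verify: permission request, approve/reject rendering, final transcript consistency
- Slash / skill entry
  - Prompt: `/bug-hunt src 폴더 잠재 버그 찾아줘` (English: "/bug-hunt find potential bugs in the src folder")
  - Apply to: `Code`
  - Verify: slash entry uses the same prepared-execution route as normal send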
Tool / Skill Delta Snapshot
- Updated: 2026-04-05 22:04 (KST)
- AX tool registry count is larger than `claw-code`, but the shape is different.
- AX reference: `src/AxCopilot/Services/Agent/ToolRegistry.cs`
- `claw-code` reference: `src/tools/*`, `src/skills/bundledSkills.ts`
AX stronger areas
- Document/office generation and conversion (`ExcelSkill`, `DocxSkill`, `PptxSkill`, `DocumentPlannerTool`, `DocumentAssemblerTool`)
- Data/business utilities (`DataPivotTool`, `SqlTool`, `FormatConvertTool`, `TextSummarizeTool`)
- WPF-integrated enterprise UX and Korean workflow presets
claw-code stronger areas
- Transcript-native tool use / rejection / approval message taxonomy
- Plan approval request/response rendering in the message stream
- Permission and tool-result message consistency
- Bundled skill registry and skill message integration
Remaining parity target
- Keep AX's richer business/document tool set
- Bring transcript rendering and approval/status UX closer to `claw-code`
Transcript-First Approval / Ask UX
- Updated: 2026-04-05 18:58 (KST)
- `plan approval` and `user ask` should both resolve inside the transcript first.
- Secondary windows are allowed only as detail surfaces, not as the primary decision flow.
- AX implementation status:
  - `plan approval`: transcript-first, detail view via `PlanViewerWindow`
  - `user ask`: transcript-first inline question card with choices / direct input / submit
Tool / Skill UX Parity Follow-up
- Updated: 2026-04-05 19:04 (KST)
- Default transcript should prefer role-oriented badges and readable labels over raw internal tool names.
- AX implementation status:
  - tool event badges: simplified to role-first labels
  - item naming: normalized into readable Korean labels or `/skill-name` style
  - observability panels: permission/background diagnostics reduced outside debug mode
- Remaining quality target:
  - move more tool-result and permission-result presentation into smaller message-type-specific helpers, closer to `claw-code` component separation
Current Snapshot
- Updated: 2026-04-05 19:42 (KST)
- Estimated parity:
  - Core engine: `89%`
  - Main transcript UI: `96%`
  - Cowork/Code runtime UX: `92%`
  - Internal settings linkage: `88%`
  - Overall AX Agent parity: `93%`
Remaining Gaps
- Prompt lifecycle parity
  - `claw-code` reference: `src/utils/handlePromptSubmit.ts`, `src/utils/processUserInput/processTextPrompt.ts`
  - AX gap:
    - `send / retry / regenerate` are mostly unified, but `slash / next turn after compact / some queue post-processing` still have sections in `ChatWindow.xaml.cs` that touch UI state first.
    - The goal is for every input entry point to go only through the same prepare/execute/finalize axis of `AxAgentExecutionEngine`.
- Plan / approval rendering parity
  - `claw-code` reference: `src/components/messages/PlanApprovalMessage.tsx`
  - AX gap:
    - The default transcript has been reduced to mostly compact pills, but approval/plan result presentation is still mixed with `Popup/Window + WPF cards`.
Quality Uplift Plan
- Updated: 2026-04-06 00:22 (KST)
- Goal: move AX Agent from parity-oriented stability into `claw-code`-grade maintainability and transcript quality, without copying implementation expression.
Track 1. Transcript Renderer Decomposition
- `claw-code` references:
  - `src/components/Messages.tsx`
  - `src/components/MessageRow.tsx`
  - `src/components/messages/AssistantToolUseMessage.tsx`
  - `src/components/messages/PlanApprovalMessage.tsx`
- AX apply targets:
  - `src/AxCopilot/Views/ChatWindow.xaml.cs`
  - new partial/helper files under `src/AxCopilot/Views/`
- Completion criteria:
  - `plan / permission / ask / tool-result / task-summary` rendering no longer lives as one large block inside `ChatWindow.xaml.cs`
  - each transcript concern has a dedicated helper/partial/class boundary
- Quality criteria:
  - render changes for one message type do not regress unrelated timeline behavior
  - transcript behavior remains stable after reopen / retry / regenerate
Track 2. Permission Presentation Catalog
- `claw-code` references:
  - `src/components/permissions/PermissionRequest.tsx`
  - `src/components/permissions/PermissionDialog.tsx`
  - tool-specific permission request components under `src/components/permissions/*`
- AX apply targets:
  - `src/AxCopilot/Services/Agent/PermissionModeCatalog.cs`
  - new `src/AxCopilot/Services/Agent/PermissionRequestPresentationCatalog.cs`
  - `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - permission request title, subtitle, icon, severity, and choice set are resolved by tool/request type
  - file edit / shell / skill / ask-user / web-like permission requests use distinct presentation metadata
- Quality criteria:
  - permission prompts feel explicit and predictable
  - user can distinguish request type without reading raw tool names or payload
Track 3. Tool Result Message Taxonomy
- `claw-code` references:
  - `src/components/messages/UserToolResultMessage/UserToolSuccessMessage.tsx`
  - `src/components/messages/UserToolResultMessage/UserToolErrorMessage.tsx`
  - `src/components/messages/UserToolResultMessage/UserToolRejectMessage.tsx`
  - `src/components/messages/UserToolResultMessage/UserToolCanceledMessage.tsx`
- AX apply targets:
  - new `src/AxCopilot/Services/Agent/ToolResultPresentationCatalog.cs`
  - `src/AxCopilot/Views/ChatWindow.TranscriptPolicy.cs`
  - `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - transcript display rules differ for `success / error / reject / cancel`
  - tool-result badges and summaries are resolved from presentation metadata instead of inline ad-hoc branches
- Quality criteria:
  - result cards read as stable UX language, not raw execution logs
  - failed and rejected tool runs are visually distinct without increasing noise
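The "presentation metadata instead of inline ad-hoc branches" criterion can be illustrated with a small lookup table keyed by result kind. The four kinds come from the completion criteria above; the badge text, tone values, and helper name are hypothetical.

```typescript
// The four kinds are from the plan; all metadata values below are assumed.
type ToolResultKind = "success" | "error" | "reject" | "cancel";

interface ResultPresentation {
  badge: string;
  tone: "neutral" | "warning" | "danger";
  showDetailByDefault: boolean;
}

// Record<...> forces an entry for every kind, so adding a new kind without
// presentation metadata becomes a compile-time error.
const TOOL_RESULT_PRESENTATION: Record<ToolResultKind, ResultPresentation> = {
  success: { badge: "Done", tone: "neutral", showDetailByDefault: false },
  error: { badge: "Failed", tone: "danger", showDetailByDefault: false },
  reject: { badge: "Rejected", tone: "warning", showDetailByDefault: false },
  cancel: { badge: "Canceled", tone: "neutral", showDetailByDefault: false },
};

function describeToolResult(kind: ToolResultKind, toolLabel: string): string {
  const presentation = TOOL_RESULT_PRESENTATION[kind];
  return `[${presentation.badge}] ${toolLabel}`;
}
```

A renderer would then read `tone` and `badge` from the table rather than branching on the result kind in each view.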
Track 4. Plan Approval Transcript-Only Flow
- `claw-code` references:
  - `src/components/messages/PlanApprovalMessage.tsx`
  - `src/components/messages/UserPlanMessage.tsx`
- AX apply targets:
  - `src/AxCopilot/Views/ChatWindow.xaml.cs`
  - `src/AxCopilot/Views/PlanViewerWindow.cs`
- Completion criteria:
  - default approval / reject / revise flow completes inline in transcript
  - `PlanViewerWindow` is detail-only and never required for primary approval flow
- Quality criteria:
  - planning feels like part of the conversation, not a modal interruption
  - approval history is replayable from persisted conversation state
Track 5. Runtime Summary Layer
- `claw-code` references:
  - `src/components/StatusLine.tsx`
  - `src/components/PromptInput/PromptInputFooter.tsx`
  - `src/bootstrap/state.ts`
- AX apply targets:
  - `src/AxCopilot/Services/AppStateService.cs`
  - `src/AxCopilot/Views/ChatWindow.xaml.cs`
- Completion criteria:
  - one runtime/status summary model feeds the status line, queue summary, runtime badge, and completion hint
  - status rendering no longer depends on scattered imperative refresh branches
- Quality criteria:
  - no contradictory or stale runtime badges
  - long-running Cowork/Code sessions stay visually calm
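A single summary model of the kind Track 5 describes could be derived in one function, so the status line, queue summary, badge, and completion hint cannot disagree. The snapshot fields and display strings below are illustrative assumptions, not AX's actual state.

```typescript
// Hypothetical runtime snapshot and summary shapes.
interface RuntimeSnapshot {
  running: boolean;
  queuedCount: number;
  lastError: string | null;
}

interface RuntimeSummary {
  statusLine: string;
  queueSummary: string;
  badge: "idle" | "running" | "error";
  completionHint: string | null;
}

// All four outputs are computed from the same snapshot in one place, which
// rules out a "running" badge next to a "completed" hint.
function summarizeRuntime(snapshot: RuntimeSnapshot): RuntimeSummary {
  const badge: RuntimeSummary["badge"] =
    snapshot.lastError !== null ? "error" : snapshot.running ? "running" : "idle";
  const statusLine =
    badge === "error" ? `Error: ${snapshot.lastError}` : badge === "running" ? "Working" : "Ready";
  return {
    badge,
    statusLine,
    queueSummary: snapshot.queuedCount > 0 ? `${snapshot.queuedCount} queued` : "",
    completionHint:
      badge === "idle" && snapshot.queuedCount === 0 ? "All requests completed" : null,
  };
}
```

UI code would subscribe to the summary (ideally debounced) instead of issuing separate refresh calls per widget.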
Track 6. Regression Prompt Ritual
- `claw-code` references:
  - runtime validation scenarios implied by `sessionRunner`, `Messages`, `StatusLine`, and permission components
- AX apply targets:
  - `docs/AX_AGENT_REGRESSION_PROMPTS.md`
  - `docs/claw-code-parity-plan.md`
  - developer workflow / release checklist
- Completion criteria:
  - Chat / Cowork / Code prompt set is treated as mandatory regression for runtime-affecting changes
  - each prompt is mapped to a failure class (`blank reply`, `duplicate banner`, `bad approval flow`, `queue drift`, `restore drift`)
- Quality criteria:
  - parity claims are based on repeatable checks instead of visual spot-checks
  - regressions are easier to catch before release
Recommended Execution Order
- Transcript renderer decomposition
- Permission presentation catalog
- Tool result taxonomy
- Plan approval transcript-only flow
- Runtime summary layer
- Regression prompt ritual hardening
Settings and Logic Review
- Updated: 2026-04-06 00:22 (KST)
- Candidate to move to developer-only:
  - `FreeTierDelaySeconds`
  - `MaxAgentIterations`
  - `MaxRetryOnError`
- Keep as runtime-critical user settings:
  - `OperationMode`
  - `MaxContextTokens`
  - `ContextCompactTriggerPercent`
  - `EnableProactiveContextCompact`
  - `EnableCoworkVerification`
  - `EnableCodeVerification`
  - code tool exposure toggles
- Rule:
  - if a setting changes the main execution route or recovery semantics without representing a stable real-world user choice, move it out of default user-facing surfaces
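The rule above can be written down as a small predicate. The trait names and the `SettingSurface` type are hypothetical, and the trait values used in the usage lines are assumptions chosen to match how the two settings are categorized in this section.

```typescript
// Hypothetical traits for reviewing a setting against the rule.
interface SettingTraits {
  name: string;
  changesExecutionRoute: boolean;
  changesRecoverySemantics: boolean;
  stableUserChoice: boolean;
}

type SettingSurface = "user-facing" | "developer-only";

function classifySetting(traits: SettingTraits): SettingSurface {
  const engineAffecting = traits.changesExecutionRoute || traits.changesRecoverySemantics;
  // Engine-affecting settings stay user-facing only when they represent a
  // stable real-world user choice.
  return engineAffecting && !traits.stableUserChoice ? "developer-only" : "user-facing";
}

// Usage with assumed trait values:
const maxRetrySurface = classifySetting({
  name: "MaxRetryOnError",
  changesExecutionRoute: false,
  changesRecoverySemantics: true,
  stableUserChoice: false,
});
const operationModeSurface = classifySetting({
  name: "OperationMode",
  changesExecutionRoute: true,
  changesRecoverySemantics: false,
  stableUserChoice: true,
});
```

Under these assumed traits, `maxRetrySurface` is `"developer-only"` and `operationModeSurface` is `"user-facing"`.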
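Remaining Gaps (continued)
- Plan / approval rendering parity (continued)
  - The goal is to converge on a more unified timeline language built on "body first, open on demand".
- Status line / composer parity
  - `claw-code` reference: `src/components/StatusLine.tsx`, `src/components/PromptInput/PromptInput.tsx`
  - AX gap:
    - The bottom status bar and composer options have been reduced considerably, but status metadata is still scattered and some toggles/quick settings remain as separate rows.
    - The goal is to compress them further into a single work bar beneath the transcript.
- Runtime event density parity
  - `claw-code` reference: `src/bridge/sessionRunner.ts`, `src/components/StatusNotices.tsx`
  - AX gap:
    - Non-debug default logs have been reduced, but some Cowork/Code events still churn the timeline frequently.
    - The goal is to normalize `permission / tool / error / complete / paused / resumed` into a more stable event shape.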
Settings Review
- Remove candidate:
  - `PlanMode`
    - current state: the user-facing UI and save path have been cleaned up to a fixed `off`, but type remnants remain in `AppSettings`, `SettingsViewModel`, and `AppStateService`
    - rationale: because the current policy is fixed to `off`, the user-selected value does not contribute meaningfully to the engine
  - `Code.EnablePlanModeTools`
    - current state: the UI/save path and default have been cleaned up to a fixed `false`, but compatibility remnants remain in the model/settings types
    - rationale: under the current engine policy it no longer changes the actual execution route
- Move to developer-only candidate:
  - `FreeTierDelaySeconds`
    - rationale: ordinary users have little reason to adjust it, and it directly affects the engine delay policy
  - `MaxAgentIterations`
  - `MaxRetryOnError`
    - rationale: runtime tuning values that directly affect the quality of the core execution loop
- Keep as runtime-critical:
  - `OperationMode`
  - `MaxContextTokens`
  - `ContextCompactTriggerPercent`
  - `EnableProactiveContextCompact`
  - `EnableCoworkVerification`
  - `EnableCodeVerification`
  - `Code.EnableWorktreeTools / EnableTeamTools / EnableCronTools`
Known UX / Performance Risks
- Topic preset hover flicker was caused by duplicate hover systems:
  - custom hover label
  - default WPF `ToolTip`
- AX fix:
  - remove default `ToolTip` from topic cards and keep a single hover label path
- Remaining runtime performance review targets:
  - `RefreshContextUsageVisual()` frequency
  - `BuildTopicButtons()` rebuild frequency
  - `OnAgentEvent` timeline churn during long Cowork/Code runs
  - compact queue summary still needs one more pass to fully match `claw-code` footer minimalism