Files

Release Gate / gate (push) Has been cancelled

Details

- claw-code 소스 구조와 AX Agent 구조를 다시 대조해 추가 품질 향상 계획 수립
- transcript renderer 분리, permission presentation catalog, tool result taxonomy, plan approval inline 마감, runtime summary 계층화, regression prompt ritual 고정 계획 문서화
- 런타임 핵심 설정과 개발자 전용 이동 후보 설정을 구분해 정리
- README 및 DEVELOPMENT 문서에 2026-04-06 00:22 (KST) 기준 이력 반영
- dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\ 경고 0 오류 0 확인

2026-04-05 21:26:25 +09:00

22 KiB

Raw Blame History

Claw Code Parity Plan (Rewritten)

Scope

Align AX Copilot with claw-code quality for loop reliability, permission/hook behavior, and session durability.

Update

Updated: 2026-04-05 15:34 (KST)
Rebased the AX Agent improvement plan on actual claw-code runtime files instead of earlier AX snapshots. The reference spine is now src/bootstrap/state.ts -> src/bridge/initReplBridge.ts -> src/bridge/sessionRunner.ts -> src/screens/REPL.tsx -> src/components/Messages.tsx -> src/components/StatusLine.tsx.
AX Agent work should follow that same quality order: state first, execution second, render last. UI-only fixes that bypass state/execution should be treated as temporary.
Updated: 2026-04-05 16:55 (KST)
Current estimated parity vs claw-code: core execution engine 82%, main chat UI 68%, Cowork/Code status UX 63%, internal settings linkage 88%, overall AX Agent 74%.
Engine-affecting settings should be handled conservatively during parity work. If a setting changes the main execution route, approval flow, or recovery behavior without representing a stable real-world user choice, it should be moved to developer-only UI or removed from user-facing surfaces.

Preserved History (Summary)

Core loop guards and post-tool verification gates are already partially implemented.
Plan Mode, parallel tool execution, and unknown-tool recovery are in place.
Session restore hardening is ongoing.

Reference Map

claw-code reference	AX apply target	completion criteria	quality criteria
`src/bootstrap/state.ts`	`src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs`	one canonical runtime/session state for current turn, queue, retry, execution events, and persisted snapshot	reopen/retry/queue flows do not create duplicate or blank assistant messages
`src/bridge/initReplBridge.ts`	`src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs`	send/regenerate/retry/queued follow-up/slash all enter through one prepared-execution path	same input under same settings takes same execution route regardless of entry point
`src/bridge/sessionRunner.ts`	`src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs`	tool start/result/error/progress normalized once inside loop layer	Cowork/Code no longer flash repeated status strings or overshare debug payloads
`src/bridge/bridgeMessaging.ts`	`src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AgentLoopService.cs`	inbound execution events separated from display-only events before UI render	execution event replay does not duplicate visible timeline banners
`src/screens/REPL.tsx`	`src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`	screen state transitions, queue flow, retry flow, and composer state use shared runtime helpers	window resize, queue chaining, and retry feel stable instead of UI-patched
`src/components/Messages.tsx`	`src/AxCopilot/Views/ChatWindow.xaml.cs`	timeline derives from normalized conversation/session state only	no token-only completions, blank cards, or direct injected duplicates
`src/components/StatusLine.tsx`	`src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs`	status strip computed from debounced runtime state, not multiple imperative refresh calls	metadata stays lightweight and does not overpower message timeline

AX Agent Improvement Phases

Phase A. Runtime State Canonicalization

Reference: src/bootstrap/state.ts
AX apply location: src/AxCopilot/Views/ChatWindow.xaml.cs, src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/ChatStorageService.cs
Completion criteria:
- Chat, Cowork, Code all update one shared runtime/session state model.
- queue, retry, post-compaction, and execution-event state can be restored after reopen.
Quality criteria:
- reopening a conversation reproduces the same visible timeline without extra assistant cards.
- queue and execution badges remain in sync with the stored conversation.

Phase B. Prepared Execution Unification

Reference: src/bridge/initReplBridge.ts
AX apply location: src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/LlmService.cs
Completion criteria:
- prompt stack assembly, execution mode choice, and final assistant commit are engine-owned.
- send/regenerate/retry/queued follow-up/slash flows all call the same preparation API.
Quality criteria:
- behavior is deterministic per tab/settings combination.
- UI stops building different prompt stacks for the same conversation state.

Phase C. AgentLoop Event Normalization

Reference: src/bridge/sessionRunner.ts, src/bridge/bridgeMessaging.ts
AX apply location: src/AxCopilot/Services/Agent/AgentLoopService.cs, src/AxCopilot/Services/Agent/AgentLoopTransitions.cs, src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs
Completion criteria:
- loop events are normalized into bounded activity/event records before UI consumption.
- permission requests, failure states, retries, and completion states use a stable event shape.
Quality criteria:
- Cowork/Code no longer flash rapidly during long-running tool sequences.
- file path/debug detail remains collapsed by default.

Phase D. Timeline Render Parity

Reference: src/screens/REPL.tsx, src/components/Messages.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml, src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- assistant/user messages, execution logs, compact boundaries, and queue summaries are rendered from one derived timeline model.
- direct imperative bubble injection is removed from normal send/regenerate/retry flows.
Quality criteria:
- no blank assistant cards.
- no token-only completion without visible content.
- no duplicate event banners after re-render.

Phase E. Composer and Status Strip Simplification

Reference: src/screens/REPL.tsx, src/components/StatusLine.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml, src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- composer height grows only on explicit line breaks.
- status strip, queue summary, and runtime activity all use debounced runtime updates.
- Chat/Cowork/Code share one responsive width calculation policy.
Quality criteria:
- resizing feels natural.
- composer does not keep growing after send.
- metadata remains subordinate to the message timeline.

Phase F. Recovery, Resume, and Verification

Reference: src/bootstrap/state.ts, src/bridge/sessionRunner.ts, src/screens/REPL.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml.cs, src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/ChatStorageService.cs
Completion criteria:
- reopen after interruption keeps queue, runtime summary, and latest visible assistant state consistent.
- retry-last and regenerate do not depend on mutating InputBox.Text.
- all three tabs pass reopen/retry/manual compact/manual stop/manual resume scenarios.
Quality criteria:
- stored conversation and rendered conversation stay identical after restore.
- final reopened state matches the last completed runtime state.

Execution Tracks

Hook contract parity

Structured hook output support (updatedInput, updatedPermissions, additionalContext).
Runtime gating through settings toggles.

Session/state parity

Deterministic run resume rules.
Stable jsonl event schema + replay compatibility.

Recovery parity

Failure-type classification and standardized retry guidance.
Reduced repeated wrong-tool loops.

Completion parity

Evidence-based finalization criteria for code/document tasks.

Done Criteria

Internal parity scenarios pass target threshold.
Resume/replay failures: zero.
dotnet build warnings/errors: zero.

Validation Matrix

Build: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify\\ -p:IntermediateOutputPath=obj\\verify\\
Manual scenario 1: Chat send -> answer visible -> retry -> regenerate -> reopen conversation
Manual scenario 2: Cowork tool run -> progress summary -> completion -> queue next request -> reopen
Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression

Canonical Prompt Set

Updated: 2026-04-05 22:04 (KST)
The following prompt set should be used for AX vs claw-code parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.
Operational checklist copy: docs/AX_AGENT_REGRESSION_PROMPTS.md

Chat basic answer

Prompt: 회의 일정 조정 메일을 정중한 한국어로 써줘
Apply to: Chat
Verify: normal reply render, retry/regenerate stability, reopen durability

Chat long-form explanation

Prompt: RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘
Apply to: Chat
Verify: long response rendering, compaction follow-up continuity

Cowork document task

Prompt: 신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함
Apply to: Cowork
Verify: topic/task preset routing, plan-first execution, actual document-oriented output path

Cowork data task

Prompt: 매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘
Apply to: Cowork
Verify: data-analysis tool choice, reduced runtime noise, final summary quality

Code bug-fix task

Prompt: 현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘
Apply to: Code
Verify: read/search/edit path, diff persistence, reopen consistency

Code build/test task

Prompt: 빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘
Apply to: Code
Verify: build/test loop, failure retry, final completion message

Queued follow-up

Prompt sequence:
- 이 창 레이아웃 문제 원인 찾아줘
- 끝나면 README도 같이 갱신해줘
Apply to: Cowork, Code
Verify: queue chaining, next-turn pickup without UI mutation

Post-compaction continuity

Prompt: 지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘
Apply to: Chat, Cowork, Code
Verify: compact-after-next-turn continuity, no token-only completion

Permission approval

Prompt: 이 파일을 수정해서 저장해줘
Apply to: Code
Verify: permission request, approve/reject rendering, final transcript consistency

Slash / skill entry

Prompt: /bug-hunt src 폴더 잠재 버그 찾아줘
Apply to: Code
Verify: slash entry uses the same prepared-execution route as normal send

Tool / Skill Delta Snapshot

Updated: 2026-04-05 22:04 (KST)
AX tool registry count is larger than claw-code, but the shape is different.
AX reference: src/AxCopilot/Services/Agent/ToolRegistry.cs
claw-code reference: src/tools/*, src/skills/bundledSkills.ts

AX stronger areas

Document/office generation and conversion (ExcelSkill, DocxSkill, PptxSkill, DocumentPlannerTool, DocumentAssemblerTool)
Data/business utilities (DataPivotTool, SqlTool, FormatConvertTool, TextSummarizeTool)
WPF-integrated enterprise UX and Korean workflow presets

claw-code stronger areas

Transcript-native tool use / rejection / approval message taxonomy
Plan approval request/response rendering in the message stream
Permission and tool-result message consistency
Bundled skill registry and skill message integration

Remaining parity target

Keep AX's richer business/document tool set
Bring transcript rendering and approval/status UX closer to claw-code

Transcript-First Approval / Ask UX

Updated: 2026-04-05 18:58 (KST)
plan approval and user ask should both resolve inside the transcript first.
Secondary windows are allowed only as detail surfaces, not as the primary decision flow.
AX implementation status:
- plan approval: transcript-first, detail view via PlanViewerWindow
- user ask: transcript-first inline question card with choices / direct input / submit

Tool / Skill UX Parity Follow-up

Updated: 2026-04-05 19:04 (KST)
Default transcript should prefer role-oriented badges and readable labels over raw internal tool names.
AX implementation status:
- tool event badges: simplified to role-first labels
- item naming: normalized into readable Korean labels or /skill-name style
- observability panels: permission/background diagnostics reduced outside debug mode
Remaining quality target:
- move more tool-result and permission-result presentation into smaller message-type-specific helpers, closer to claw-code component separation

Current Snapshot

Updated: 2026-04-05 19:42 (KST)
Estimated parity:
- Core engine: 89%
- Main transcript UI: 96%
- Cowork/Code runtime UX: 92%
- Internal settings linkage: 88%
- Overall AX Agent parity: 93%

Remaining Gaps

Prompt lifecycle parity

claw-code reference: src/utils/handlePromptSubmit.ts, src/utils/processUserInput/processTextPrompt.ts
AX gap:
- send / retry / regenerate are mostly unified, but slash / compact 후 다음 턴 / 일부 queue 후처리는 아직 ChatWindow.xaml.cs에서 UI 상태를 먼저 만지는 구간이 남아 있습니다.
- 목표는 모든 입력 진입점이 AxAgentExecutionEngine의 동일한 prepare/execute/finalize 축만 타게 만드는 것입니다.

Plan / approval rendering parity

claw-code reference: src/components/messages/PlanApprovalMessage.tsx
AX gap:
기본 transcript에서는 compact pill 위주로 줄였지만, 승인/계획 결과 표현이 아직 Popup/Window + WPF 카드와 섞여 있습니다.

Quality Uplift Plan

Updated: 2026-04-06 00:22 (KST)
Goal: move AX Agent from parity-oriented stability into claw-code-grade maintainability and transcript quality, without copying implementation expression.

Track 1. Transcript Renderer Decomposition

claw-code references:
- src/components/Messages.tsx
- src/components/MessageRow.tsx
- src/components/messages/AssistantToolUseMessage.tsx
- src/components/messages/PlanApprovalMessage.tsx
AX apply targets:
- src/AxCopilot/Views/ChatWindow.xaml.cs
- new partial/helper files under src/AxCopilot/Views/
Completion criteria:
- plan / permission / ask / tool-result / task-summary rendering no longer lives as one large block inside ChatWindow.xaml.cs
- each transcript concern has a dedicated helper/partial/class boundary
Quality criteria:
- render changes for one message type do not regress unrelated timeline behavior
- transcript behavior remains stable after reopen / retry / regenerate

Track 2. Permission Presentation Catalog

claw-code references:
- src/components/permissions/PermissionRequest.tsx
- src/components/permissions/PermissionDialog.tsx
- tool-specific permission request components under src/components/permissions/*
AX apply targets:
- src/AxCopilot/Services/Agent/PermissionModeCatalog.cs
- new src/AxCopilot/Services/Agent/PermissionRequestPresentationCatalog.cs
- src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- permission request title, subtitle, icon, severity, and choice set are resolved by tool/request type
- file edit / shell / skill / ask-user / web-like permission requests use distinct presentation metadata
Quality criteria:
- permission prompts feel explicit and predictable
- user can distinguish request type without reading raw tool names or payload

Track 3. Tool Result Message Taxonomy

claw-code references:
- src/components/messages/UserToolResultMessage/UserToolSuccessMessage.tsx
- src/components/messages/UserToolResultMessage/UserToolErrorMessage.tsx
- src/components/messages/UserToolResultMessage/UserToolRejectMessage.tsx
- src/components/messages/UserToolResultMessage/UserToolCanceledMessage.tsx
AX apply targets:
- new src/AxCopilot/Services/Agent/ToolResultPresentationCatalog.cs
- src/AxCopilot/Views/ChatWindow.TranscriptPolicy.cs
- src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- transcript display rules differ for success / error / reject / cancel
- tool-result badges and summaries are resolved from presentation metadata instead of inline ad-hoc branches
Quality criteria:
- result cards read as stable UX language, not raw execution logs
- failed and rejected tool runs are visually distinct without increasing noise

Track 4. Plan Approval Transcript-Only Flow

claw-code references:
- src/components/messages/PlanApprovalMessage.tsx
- src/components/messages/UserPlanMessage.tsx
AX apply targets:
- src/AxCopilot/Views/ChatWindow.xaml.cs
- src/AxCopilot/Views/PlanViewerWindow.cs
Completion criteria:
- default approval / reject / revise flow completes inline in transcript
- PlanViewerWindow is detail-only and never required for primary approval flow
Quality criteria:
- planning feels like part of the conversation, not a modal interruption
- approval history is replayable from persisted conversation state

Track 5. Runtime Summary Layer

claw-code references:
- src/components/StatusLine.tsx
- src/components/PromptInput/PromptInputFooter.tsx
- src/bootstrap/state.ts
AX apply targets:
- src/AxCopilot/Services/AppStateService.cs
- src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- one runtime/status summary model feeds the status line, queue summary, runtime badge, and completion hint
- status rendering no longer depends on scattered imperative refresh branches
Quality criteria:
- no contradictory or stale runtime badges
- long-running Cowork/Code sessions stay visually calm

Track 6. Regression Prompt Ritual

claw-code references:
- runtime validation scenarios implied by sessionRunner, Messages, StatusLine, and permission components
AX apply targets:
- docs/AX_AGENT_REGRESSION_PROMPTS.md
- docs/claw-code-parity-plan.md
- developer workflow / release checklist
Completion criteria:
- Chat / Cowork / Code prompt set is treated as mandatory regression for runtime-affecting changes
- each prompt is mapped to a failure class (blank reply, duplicate banner, bad approval flow, queue drift, restore drift)
Quality criteria:
- parity claims are based on repeatable checks instead of visual spot-checks
- regressions are easier to catch before release

Recommended Execution Order

Transcript renderer decomposition
Permission presentation catalog
Tool result taxonomy
Plan approval transcript-only flow
Runtime summary layer
Regression prompt ritual hardening

Settings and Logic Review

Updated: 2026-04-06 00:22 (KST)
Candidate to move to developer-only:
- FreeTierDelaySeconds
- MaxAgentIterations
- MaxRetryOnError
Keep as runtime-critical user settings:
- OperationMode
- MaxContextTokens
- ContextCompactTriggerPercent
- EnableProactiveContextCompact
- EnableCoworkVerification
- EnableCodeVerification
- code tool exposure toggles
Rule:
- if a setting changes the main execution route or recovery semantics without representing a stable real-world user choice, move it out of default user-facing surfaces
- 목표는 “본문 우선 + 필요 시 열기” 기준으로 더 단일한 timeline 언어로 수렴시키는 것입니다.

Status line / composer parity

claw-code reference: src/components/StatusLine.tsx, src/components/PromptInput/PromptInput.tsx
AX gap:
- 하단 상태바와 composer 옵션은 많이 줄었지만, 상태 메타가 여전히 분산돼 있고 일부 토글/빠른 설정이 별도 행으로 남아 있습니다.
- 목표는 transcript 하단의 작업 바 한 축으로 더 압축하는 것입니다.

Runtime event density parity

claw-code reference: src/bridge/sessionRunner.ts, src/components/StatusNotices.tsx
AX gap:
- non-debug 기본 로그는 줄었지만, 일부 Cowork/Code 이벤트는 여전히 timeline을 자주 흔듭니다.
- 목표는 permission / tool / error / complete / paused / resumed를 더 안정된 event shape로 정규화하는 것입니다.

Settings Review

Remove candidate:
- PlanMode
  - current state: 사용자 노출 UI와 저장 경로는 off 고정으로 정리됐지만 AppSettings, SettingsViewModel, AppStateService 타입 잔재가 남아 있음
  - rationale: 현재 정책이 off 고정이라 사용자 선택값이 엔진에 의미 있게 기여하지 않음
- Code.EnablePlanModeTools
  - current state: UI/저장 경로와 기본값은 false 고정으로 정리됐지만 모델/설정 타입에 호환용 잔재가 남아 있음
  - rationale: 현재 엔진 정책에서 실제 실행 경로를 더 이상 바꾸지 않음
Move to developer-only candidate:
- FreeTierDelaySeconds
  - rationale: 일반 사용자가 조정할 이유가 적고 엔진 지연 정책에 직접 영향
- MaxAgentIterations
- MaxRetryOnError
  - rationale: 핵심 실행 루프 품질에 직접 영향하는 런타임 튜닝값
Keep as runtime-critical:
- OperationMode
- MaxContextTokens
- ContextCompactTriggerPercent
- EnableProactiveContextCompact
- EnableCoworkVerification
- EnableCodeVerification
- Code.EnableWorktreeTools / EnableTeamTools / EnableCronTools

Known UX / Performance Risks

Topic preset hover flicker was caused by duplicate hover systems:
- custom hover label
- default WPF ToolTip
AX fix:
- remove default ToolTip from topic cards and keep a single hover label path
Remaining runtime performance review targets:
- RefreshContextUsageVisual() frequency
- BuildTopicButtons() rebuild frequency
- OnAgentEvent timeline churn during long Cowork/Code runs
- compact queue summary still needs one more pass to fully match claw-code footer minimalism

22 KiB Raw Blame History

Claw Code Parity Plan (Rewritten)

Scope

Update

Preserved History (Summary)

Reference Map

AX Agent Improvement Phases

Phase A. Runtime State Canonicalization

Phase B. Prepared Execution Unification

Phase C. AgentLoop Event Normalization

Phase D. Timeline Render Parity

Phase E. Composer and Status Strip Simplification

Phase F. Recovery, Resume, and Verification

Execution Tracks

Done Criteria

Validation Matrix

Canonical Prompt Set

Tool / Skill Delta Snapshot

AX stronger areas

claw-code stronger areas

Remaining parity target

Transcript-First Approval / Ask UX

Tool / Skill UX Parity Follow-up

Current Snapshot

Remaining Gaps

Quality Uplift Plan

Track 1. Transcript Renderer Decomposition

Track 2. Permission Presentation Catalog

Track 3. Tool Result Message Taxonomy

Track 4. Plan Approval Transcript-Only Flow

Track 5. Runtime Summary Layer

Track 6. Regression Prompt Ritual

Recommended Execution Order

Settings and Logic Review

Settings Review

Known UX / Performance Risks

22 KiB

Raw Blame History