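Align AX Agent tool/skill transcript presentation with claw-code conventions

- Simplify tool/skill event badges into role-oriented labels
- Use human-readable display names and the /skill format instead of raw snake_case
- Restrict the permission/background-task observability sections of the task summary popup to debug level
- Reflect the tool/skill UX cleanup criteria in the parity document and the development history document

Verification results
- dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\
- 0 warnings / 0 errors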

2026-04-05 19:05:49 +09:00

Claw Code Parity Plan (Rewritten)

Scope

Align AX Copilot with claw-code quality for loop reliability, permission/hook behavior, and session durability.

Update

Updated: 2026-04-05 15:34 (KST)
Rebased the AX Agent improvement plan on actual claw-code runtime files instead of earlier AX snapshots. The reference spine is now src/bootstrap/state.ts -> src/bridge/initReplBridge.ts -> src/bridge/sessionRunner.ts -> src/screens/REPL.tsx -> src/components/Messages.tsx -> src/components/StatusLine.tsx.
AX Agent work should follow that same quality order: state first, execution second, render last. UI-only fixes that bypass state/execution should be treated as temporary.
Updated: 2026-04-05 16:55 (KST)
Current estimated parity vs claw-code: core execution engine 82%, main chat UI 68%, Cowork/Code status UX 63%, internal settings linkage 88%, overall AX Agent 74%.
Engine-affecting settings should be handled conservatively during parity work. If a setting changes the main execution route, approval flow, or recovery behavior without representing a stable real-world user choice, it should be moved to developer-only UI or removed from user-facing surfaces.

Preserved History (Summary)

Core loop guards and post-tool verification gates are already partially implemented.
Plan Mode, parallel tool execution, and unknown-tool recovery are in place.
Session restore hardening is ongoing.

Reference Map

| claw-code reference | AX apply target | Completion criteria | Quality criteria |
| --- | --- | --- | --- |
| `src/bootstrap/state.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/ChatStorageService.cs` | one canonical runtime/session state for current turn, queue, retry, execution events, and persisted snapshot | reopen/retry/queue flows do not create duplicate or blank assistant messages |
| `src/bridge/initReplBridge.ts` | `src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs`, `src/AxCopilot/Services/LlmService.cs` | send/regenerate/retry/queued follow-up/slash all enter through one prepared-execution path | same input under same settings takes same execution route regardless of entry point |
| `src/bridge/sessionRunner.ts` | `src/AxCopilot/Services/Agent/AgentLoopService.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.cs`, `src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs` | tool start/result/error/progress normalized once inside loop layer | Cowork/Code no longer flash repeated status strings or overshare debug payloads |
| `src/bridge/bridgeMessaging.ts` | `src/AxCopilot/Views/ChatWindow.xaml.cs`, `src/AxCopilot/Services/Agent/AgentLoopService.cs` | inbound execution events separated from display-only events before UI render | execution event replay does not duplicate visible timeline banners |
| `src/screens/REPL.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | screen state transitions, queue flow, retry flow, and composer state use shared runtime helpers | window resize, queue chaining, and retry feel stable instead of UI-patched |
| `src/components/Messages.tsx` | `src/AxCopilot/Views/ChatWindow.xaml.cs` | timeline derives from normalized conversation/session state only | no token-only completions, blank cards, or direct injected duplicates |
| `src/components/StatusLine.tsx` | `src/AxCopilot/Views/ChatWindow.xaml`, `src/AxCopilot/Views/ChatWindow.xaml.cs` | status strip computed from debounced runtime state, not multiple imperative refresh calls | metadata stays lightweight and does not overpower message timeline |

AX Agent Improvement Phases

Phase A. Runtime State Canonicalization

Reference: src/bootstrap/state.ts
AX apply location: src/AxCopilot/Views/ChatWindow.xaml.cs, src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/ChatStorageService.cs
Completion criteria:
- Chat, Cowork, Code all update one shared runtime/session state model.
- queue, retry, post-compaction, and execution-event state can be restored after reopen.
Quality criteria:
- reopening a conversation reproduces the same visible timeline without extra assistant cards.
- queue and execution badges remain in sync with the stored conversation.
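
The Phase A criteria can be illustrated with a minimal TypeScript sketch. Every name here (`AxSessionState`, `snapshotState`, `restoreState`) is hypothetical and is not taken from the AX or claw-code sources; the point is only that one serializable state object carries messages, queue, retry, and execution-event state, and that restore never yields blank or duplicate assistant cards.

```typescript
// Hypothetical canonical state shared by Chat, Cowork, and Code (assumed shape).
type AxTab = "chat" | "cowork" | "code";

interface AxMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
}

interface AxSessionState {
  tab: AxTab;
  messages: AxMessage[];
  queue: string[];           // queued follow-up prompts
  retryCount: number;        // retries used in the current turn
  executionEvents: string[]; // normalized event ids, oldest first
}

// Serialize the whole state so queue/retry/event state survives a reopen.
function snapshotState(state: AxSessionState): string {
  return JSON.stringify(state);
}

// Restore a snapshot, dropping duplicate ids and blank assistant messages.
function restoreState(snapshot: string): AxSessionState {
  const parsed = JSON.parse(snapshot) as AxSessionState;
  const seen: { [id: string]: boolean } = {};
  const messages = parsed.messages.filter((m) => {
    if (seen[m.id] === true) return false;
    seen[m.id] = true;
    return !(m.role === "assistant" && m.content.trim() === "");
  });
  return { ...parsed, messages };
}
```

Because the snapshot is the only input to restore, the reopened timeline is a function of stored state alone, which is what the quality criteria ask for.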

Phase B. Prepared Execution Unification

Reference: src/bridge/initReplBridge.ts
AX apply location: src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/LlmService.cs
Completion criteria:
- prompt stack assembly, execution mode choice, and final assistant commit are engine-owned.
- send/regenerate/retry/queued follow-up/slash flows all call the same preparation API.
Quality criteria:
- behavior is deterministic per tab/settings combination.
- UI stops building different prompt stacks for the same conversation state.
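
A minimal sketch of the Phase B idea, under stated assumptions: the names `EntryPoint`, `PrepareInput`, and `prepareExecution` are hypothetical, and the rule "Chat runs directly, Cowork/Code run the agent loop" is an illustrative guess, not confirmed engine behavior. What it shows is that the entry point is carried along but never influences the prepared result.

```typescript
// Hypothetical entry points; all of them funnel into prepareExecution().
type EntryPoint = "send" | "regenerate" | "retry" | "queued" | "slash";

interface PrepareInput {
  entry: EntryPoint;
  tab: "chat" | "cowork" | "code";
  text: string;
  systemPrompt: string;
}

interface PreparedExecution {
  mode: "direct" | "agent-loop";
  promptStack: string[];
}

// The engine, not the UI, owns prompt stack assembly and execution mode choice.
// input.entry is deliberately ignored so every entry point takes the same route.
function prepareExecution(input: PrepareInput): PreparedExecution {
  // Assumption for illustration: only the tab decides the execution mode.
  const mode: PreparedExecution["mode"] = input.tab === "chat" ? "direct" : "agent-loop";
  return { mode: mode, promptStack: [input.systemPrompt, input.text.trim()] };
}
```

With this shape, determinism per tab/settings combination can be checked by comparing the prepared result across entry points for the same conversation state.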

Phase C. AgentLoop Event Normalization

Reference: src/bridge/sessionRunner.ts, src/bridge/bridgeMessaging.ts
AX apply location: src/AxCopilot/Services/Agent/AgentLoopService.cs, src/AxCopilot/Services/Agent/AgentLoopTransitions.cs, src/AxCopilot/Services/Agent/AgentLoopTransitions.Execution.cs
Completion criteria:
- loop events are normalized into bounded activity/event records before UI consumption.
- permission requests, failure states, retries, and completion states use a stable event shape.
Quality criteria:
- Cowork/Code no longer flash rapidly during long-running tool sequences.
- file path/debug detail remains collapsed by default.
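
The normalization step could look roughly like the sketch below. The raw event type strings, the 80-character bound, and all type names are assumptions made for illustration; only the six target kinds come from this plan's event list (permission / tool / error / complete / paused / resumed).

```typescript
// Target kinds follow the event list used elsewhere in this plan.
type AxEventKind = "permission" | "tool" | "error" | "complete" | "paused" | "resumed";

// Hypothetical raw event as emitted by the loop before normalization.
interface RawLoopEvent {
  type: string;
  toolName?: string;
  message?: string;
  filePath?: string;
}

// Bounded record consumed by the UI; detail stays collapsed by default.
interface AxActivityRecord {
  kind: AxEventKind;
  summary: string;
  detail: string | null;
  collapsed: boolean;
}

const MAX_SUMMARY_LENGTH = 80; // assumed bound

// Map assumed raw type strings onto the stable kinds; unknown types fall back to "tool".
function toKind(rawType: string): AxEventKind {
  switch (rawType) {
    case "permission_request": return "permission";
    case "tool_error":
    case "error": return "error";
    case "complete": return "complete";
    case "paused": return "paused";
    case "resumed": return "resumed";
    default: return "tool";
  }
}

function normalizeLoopEvent(raw: RawLoopEvent): AxActivityRecord {
  const text = raw.message ?? raw.toolName ?? raw.type;
  const summary = text.length > MAX_SUMMARY_LENGTH
    ? text.slice(0, MAX_SUMMARY_LENGTH - 3) + "..."
    : text;
  return {
    kind: toKind(raw.type),
    summary: summary,
    detail: raw.filePath ?? null,
    collapsed: true,
  };
}
```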

Phase D. Timeline Render Parity

Reference: src/screens/REPL.tsx, src/components/Messages.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml, src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- assistant/user messages, execution logs, compact boundaries, and queue summaries are rendered from one derived timeline model.
- direct imperative bubble injection is removed from normal send/regenerate/retry flows.
Quality criteria:
- no blank assistant cards.
- no token-only completion without visible content.
- no duplicate event banners after re-render.
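
As a rough illustration of a derived timeline, the sketch below builds the visible list purely from stored entries. `TimelineEntry` and `deriveTimeline` are hypothetical names; a token-only completion is modeled here as an assistant entry whose text is empty or whitespace.

```typescript
// Hypothetical stored entry; the UI never injects bubbles directly.
interface TimelineEntry {
  id: string;
  kind: "user" | "assistant" | "event";
  text: string;
}

// Derive the visible timeline from state only:
// - skip assistant entries with no visible content (blank or token-only)
// - skip event banners whose id was already rendered (replay / re-render)
function deriveTimeline(entries: TimelineEntry[]): TimelineEntry[] {
  const seenEventIds: { [id: string]: boolean } = {};
  const visible: TimelineEntry[] = [];
  for (const entry of entries) {
    if (entry.kind === "assistant" && entry.text.trim() === "") continue;
    if (entry.kind === "event") {
      if (seenEventIds[entry.id] === true) continue;
      seenEventIds[entry.id] = true;
    }
    visible.push(entry);
  }
  return visible;
}
```

Re-rendering then means calling the derivation again on the same state, which yields the same list each time.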

Phase E. Composer and Status Strip Simplification

Reference: src/screens/REPL.tsx, src/components/StatusLine.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml, src/AxCopilot/Views/ChatWindow.xaml.cs
Completion criteria:
- composer height grows only on explicit line breaks.
- status strip, queue summary, and runtime activity all use debounced runtime updates.
- Chat/Cowork/Code share one responsive width calculation policy.
Quality criteria:
- resizing feels natural.
- composer does not keep growing after send.
- metadata remains subordinate to the message timeline.
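
Two of the Phase E rules can be sketched as pure functions. Both helpers are hypothetical: `composerLineCount` counts only explicit line breaks (soft wrapping never adds a line), and `debounceUpdates` models trailing-edge debouncing over timestamped updates instead of using real timers, so the behavior is easy to verify.

```typescript
// Composer height in lines: driven by explicit "\n" only, clamped to maxLines.
function composerLineCount(text: string, maxLines: number): number {
  const explicitLines = text.split("\n").length; // "" -> 1 line
  return Math.min(Math.max(explicitLines, 1), maxLines);
}

interface TimedUpdate {
  atMs: number;  // arrival time of the runtime update
  value: string; // status text to show
}

// Trailing-edge debounce as a pure function: an update is emitted only when
// no newer update arrives within quietMs. The final update always survives.
// Assumes updates are sorted by atMs ascending.
function debounceUpdates(updates: TimedUpdate[], quietMs: number): string[] {
  const emitted: string[] = [];
  updates.forEach((update, i) => {
    const next = updates[i + 1];
    if (next === undefined || next.atMs - update.atMs >= quietMs) {
      emitted.push(update.value);
    }
  });
  return emitted;
}
```

After send, the composer text is empty, so the line count returns to 1 without any separate reset logic.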

Phase F. Recovery, Resume, and Verification

Reference: src/bootstrap/state.ts, src/bridge/sessionRunner.ts, src/screens/REPL.tsx
AX apply location: src/AxCopilot/Views/ChatWindow.xaml.cs, src/AxCopilot/Services/Agent/AxAgentExecutionEngine.cs, src/AxCopilot/Services/ChatStorageService.cs
Completion criteria:
- reopen after interruption keeps queue, runtime summary, and latest visible assistant state consistent.
- retry-last and regenerate do not depend on mutating InputBox.Text.
- all three tabs pass reopen/retry/manual compact/manual stop/manual resume scenarios.
Quality criteria:
- stored conversation and rendered conversation stay identical after restore.
- final reopened state matches the last completed runtime state.
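
Two Phase F checks lend themselves to a short sketch. The helper names are hypothetical: `lastUserPrompt` reads the retry/regenerate input from stored turns instead of a text box, and `sameTimeline` is one possible way to assert that stored and rendered conversations match after restore.

```typescript
interface StoredTurn {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// Retry-last / regenerate take their input from state, not from the composer.
function lastUserPrompt(turns: StoredTurn[]): string | null {
  for (let i = turns.length - 1; i >= 0; i--) {
    const turn = turns[i];
    if (turn !== undefined && turn.role === "user") return turn.text;
  }
  return null;
}

// Stored and rendered conversations must match turn by turn after restore.
function sameTimeline(stored: StoredTurn[], rendered: StoredTurn[]): boolean {
  if (stored.length !== rendered.length) return false;
  return stored.every((turn, i) => {
    const other = rendered[i];
    return other !== undefined && other.id === turn.id && other.text === turn.text;
  });
}
```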

Execution Tracks

Hook contract parity

Structured hook output support (updatedInput, updatedPermissions, additionalContext).
Runtime gating through settings toggles.
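
A sketch of how structured hook output might be applied, assuming a hypothetical `applyHookOutput` helper. The three field names (`updatedInput`, `updatedPermissions`, `additionalContext`) come from this plan; the permission values (`allow` / `deny` / `ask`), the toggle names, and the choice to always accept `additionalContext` are assumptions.

```typescript
type PermissionDecision = "allow" | "deny" | "ask"; // assumed value set
type PermissionMap = { [toolName: string]: PermissionDecision };

// Structured hook output; every field is optional.
interface HookOutput {
  updatedInput?: string;
  updatedPermissions?: PermissionMap;
  additionalContext?: string;
}

// Hypothetical settings toggles that gate what a hook may change at runtime.
interface HookToggles {
  allowInputRewrite: boolean;
  allowPermissionUpdate: boolean;
}

interface ToolCallState {
  input: string;
  permissions: PermissionMap;
  context: string[];
}

// Returns a new state; the original is left untouched.
function applyHookOutput(state: ToolCallState, hook: HookOutput, toggles: HookToggles): ToolCallState {
  const next: ToolCallState = {
    input: state.input,
    permissions: { ...state.permissions },
    context: state.context.slice(),
  };
  if (toggles.allowInputRewrite && hook.updatedInput !== undefined) {
    next.input = hook.updatedInput;
  }
  if (toggles.allowPermissionUpdate && hook.updatedPermissions !== undefined) {
    next.permissions = { ...next.permissions, ...hook.updatedPermissions };
  }
  if (hook.additionalContext !== undefined) {
    next.context.push(hook.additionalContext); // assumed to be always accepted
  }
  return next;
}
```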

Session/state parity

Deterministic run resume rules.
Stable jsonl event schema + replay compatibility.
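
One possible reading of "stable jsonl event schema + replay compatibility" is sketched below. The record fields (`v`, `seq`, `kind`, `payload`) and the version constant are hypothetical; the sketch only shows that replay can be made deterministic by skipping unknown schema versions, ordering by sequence number, and keeping one record per sequence number.

```typescript
const SCHEMA_VERSION = 1; // assumed current schema version

// Hypothetical one-line-per-event record.
interface SessionEventLine {
  v: number;       // schema version
  seq: number;     // monotonically increasing sequence number
  kind: string;
  payload: string;
}

function toJsonl(events: SessionEventLine[]): string {
  return events.map((e) => JSON.stringify(e)).join("\n");
}

// Deterministic replay: ignore blank lines and unknown versions,
// sort by seq, and keep a single record per seq.
function replayJsonl(text: string): SessionEventLine[] {
  const parsedLines: SessionEventLine[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    const parsed = JSON.parse(line) as SessionEventLine;
    if (parsed.v !== SCHEMA_VERSION) continue;
    parsedLines.push(parsed);
  }
  parsedLines.sort((a, b) => a.seq - b.seq);
  const replayed: SessionEventLine[] = [];
  let lastSeq: number | null = null;
  for (const e of parsedLines) {
    if (lastSeq !== null && e.seq === lastSeq) continue;
    replayed.push(e);
    lastSeq = e.seq;
  }
  return replayed;
}
```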

Recovery parity

Failure-type classification and standardized retry guidance.
Reduced repeated wrong-tool loops.
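
Failure classification with standardized retry guidance might be sketched as follows. The failure categories, the substring matching, and the hint texts are illustrative assumptions; `maxRetryOnError` mirrors the MaxRetryOnError setting named in this plan, and the actual engine rules may differ.

```typescript
type FailureType = "unknown_tool" | "permission_denied" | "timeout" | "tool_error";

interface RetryDecision {
  retry: boolean;
  hint: string;
}

// Classify a failure message by simple substring checks (assumed heuristics).
function classifyFailure(message: string): FailureType {
  const text = message.toLowerCase();
  if (text.indexOf("unknown tool") >= 0) return "unknown_tool";
  if (text.indexOf("permission") >= 0) return "permission_denied";
  if (text.indexOf("timed out") >= 0 || text.indexOf("timeout") >= 0) return "timeout";
  return "tool_error";
}

// One guidance table for all tabs, so retries behave the same everywhere.
function retryGuidance(failure: FailureType, attempt: number, maxRetryOnError: number): RetryDecision {
  if (attempt >= maxRetryOnError) {
    return { retry: false, hint: "Retry budget exhausted; report the failure." };
  }
  switch (failure) {
    case "unknown_tool":
      return { retry: true, hint: "Pick a tool from the registered tool list." };
    case "permission_denied":
      return { retry: false, hint: "Ask the user for approval before retrying." };
    case "timeout":
      return { retry: true, hint: "Retry with a narrower scope." };
    default:
      return { retry: true, hint: "Inspect the tool error and adjust the arguments." };
  }
}
```

Feeding the hint back to the model on an `unknown_tool` failure is one way to cut down repeated wrong-tool loops, since the next attempt is steered toward registered tools.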

Completion parity

Evidence-based finalization criteria for code/document tasks.

Done Criteria

Internal parity scenarios pass target threshold.
Resume/replay failures: zero.
dotnet build warnings/errors: zero.

Validation Matrix

Build: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\
Manual scenario 1: Chat send -> answer visible -> retry -> regenerate -> reopen conversation
Manual scenario 2: Cowork tool run -> progress summary -> completion -> queue next request -> reopen
Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression

Canonical Prompt Set

Updated: 2026-04-05 22:04 (KST)
The following prompt set should be used for AX vs claw-code parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.

Chat basic answer

Prompt: 회의 일정 조정 메일을 정중한 한국어로 써줘 (English: "Write an email in polite Korean to reschedule a meeting")
Apply to: Chat
Verify: normal reply render, retry/regenerate stability, reopen durability

Chat long-form explanation

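Prompt: RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘 (English: "Explain seven differences between RAG and fine-tuning from a practitioner's perspective")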
Apply to: Chat
Verify: long response rendering, compaction follow-up continuity

Cowork document task

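Prompt: 신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함 (English: "Draft a proposal for adopting a new ERP system. Include purpose, scope, expected benefits, and rollout schedule")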
Apply to: Cowork
Verify: topic/task preset routing, plan-first execution, actual document-oriented output path

Cowork data task

Prompt: 매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘 (English: "Analyze the sales CSV and summarize monthly trends and outliers")
Apply to: Cowork
Verify: data-analysis tool choice, reduced runtime noise, final summary quality

Code bug-fix task

Prompt: 현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘 (English: "Find the cause of the settings-save bug in the current project and fix it")
Apply to: Code
Verify: read/search/edit path, diff persistence, reopen consistency

Code build/test task

Prompt: 빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘 (English: "Reproduce the build error, fix it, and build again")
Apply to: Code
Verify: build/test loop, failure retry, final completion message

Queued follow-up

Prompt sequence:
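- 이 창 레이아웃 문제 원인 찾아줘 (English: "Find the cause of this window's layout problem")
- 끝나면 README도 같이 갱신해줘 (English: "When you're done, update the README as well")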
Apply to: Cowork, Code
Verify: queue chaining, next-turn pickup without UI mutation

Post-compaction continuity

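Prompt: 지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘 (English: "Continue by summarizing what we have discussed so far in 5 lines and suggest the next task")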
Apply to: Chat, Cowork, Code
Verify: compact-after-next-turn continuity, no token-only completion

Permission approval

Prompt: 이 파일을 수정해서 저장해줘 (English: "Edit this file and save it")
Apply to: Code
Verify: permission request, approve/reject rendering, final transcript consistency

Slash / skill entry

Prompt: /bug-hunt src 폴더 잠재 버그 찾아줘 (English: "/bug-hunt find potential bugs in the src folder")
Apply to: Code
Verify: slash entry uses the same prepared-execution route as normal send

Tool / Skill Delta Snapshot

Updated: 2026-04-05 22:04 (KST)
The AX tool registry contains more tools than claw-code's, but the two tool sets are shaped differently.
AX reference: src/AxCopilot/Services/Agent/ToolRegistry.cs
claw-code reference: src/tools/*, src/skills/bundledSkills.ts

AX stronger areas

Document/office generation and conversion (ExcelSkill, DocxSkill, PptxSkill, DocumentPlannerTool, DocumentAssemblerTool)
Data/business utilities (DataPivotTool, SqlTool, FormatConvertTool, TextSummarizeTool)
WPF-integrated enterprise UX and Korean workflow presets

claw-code stronger areas

Transcript-native tool use / rejection / approval message taxonomy
Plan approval request/response rendering in the message stream
Permission and tool-result message consistency
Bundled skill registry and skill message integration

Remaining parity target

Keep AX's richer business/document tool set
Bring transcript rendering and approval/status UX closer to claw-code

Transcript-First Approval / Ask UX

Updated: 2026-04-05 18:58 (KST)
Plan approval and user-ask prompts should both resolve inside the transcript first.
Secondary windows are allowed only as detail surfaces, not as the primary decision flow.
AX implementation status:
- plan approval: transcript-first, detail view via PlanViewerWindow
- user ask: transcript-first inline question card with choices / direct input / submit

Tool / Skill UX Parity Follow-up

Updated: 2026-04-05 19:04 (KST)
Default transcript should prefer role-oriented badges and readable labels over raw internal tool names.
AX implementation status:
- tool event badges: simplified to role-first labels
- item naming: normalized into readable Korean labels or /skill-name style
- observability panels: permission/background diagnostics reduced outside debug mode
Remaining quality target:
- move more tool-result and permission-result presentation into smaller message-type-specific helpers, closer to claw-code component separation
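
The naming rule can be illustrated with a small helper. `toDisplayName` is a hypothetical function, and it produces English title-case labels as a stand-in for the readable Korean labels mentioned above; the /skill-name form matches the slash style used in the canonical prompt set (for example /bug-hunt).

```typescript
// Turn a raw snake_case identifier into a transcript-friendly label.
// Skills become "/skill-name"; tools become a title-cased readable label.
function toDisplayName(rawName: string, isSkill: boolean): string {
  const words = rawName.split("_").filter((w) => w.length > 0);
  if (isSkill) {
    return "/" + words.join("-").toLowerCase();
  }
  return words
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1).toLowerCase())
    .join(" ");
}
```

For instance, `toDisplayName("file_read", false)` yields `File Read` and `toDisplayName("bug_hunt", true)` yields `/bug-hunt`. In practice a lookup table of curated labels would likely take precedence over this mechanical fallback.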

Current Snapshot

Updated: 2026-04-05 19:42 (KST)
Estimated parity:
- Core engine: 89%
- Main transcript UI: 96%
- Cowork/Code runtime UX: 92%
- Internal settings linkage: 88%
- Overall AX Agent parity: 93%

Remaining Gaps

Prompt lifecycle parity

claw-code reference: src/utils/handlePromptSubmit.ts, src/utils/processUserInput/processTextPrompt.ts
AX gap:
- send / retry / regenerate are mostly unified, but slash entry, the next turn after compact, and some queue post-processing still touch UI state first in ChatWindow.xaml.cs.
- The goal is for every input entry point to go only through the same prepare/execute/finalize path in AxAgentExecutionEngine.

Plan / approval rendering parity

claw-code reference: src/components/messages/PlanApprovalMessage.tsx
AX gap:
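- The default transcript has been reduced to mostly compact pills, but approval/plan result presentation is still mixed with Popup/Window surfaces and WPF cards.
- The goal is to converge on a more unified timeline language under a "transcript first, open details when needed" rule.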

Status line / composer parity

claw-code reference: src/components/StatusLine.tsx, src/components/PromptInput/PromptInput.tsx
AX gap:
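- The bottom status bar and composer options have been trimmed considerably, but status metadata is still scattered and some toggles/quick settings remain on separate rows.
- The goal is to compress these further into a single work bar below the transcript.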

Runtime event density parity

claw-code reference: src/bridge/sessionRunner.ts, src/components/StatusNotices.tsx
AX gap:
- Default non-debug logging has been reduced, but some Cowork/Code events still cause frequent timeline churn.
- The goal is to normalize permission / tool / error / complete / paused / resumed into a more stable event shape.

Settings Review

Remove candidate:
- PlanMode
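  - current state: the user-facing UI and save path are now fixed to off, but leftover type members remain in AppSettings, SettingsViewModel, and AppStateService
  - rationale: the current policy is fixed to off, so the user's choice no longer contributes meaningfully to the engine
- Code.EnablePlanModeTools
  - current state: the UI/save path and default value are now fixed to false, but compatibility leftovers remain in the model/settings types
  - rationale: under the current engine policy it no longer changes the actual execution route
Move to developer-only candidate:
- FreeTierDelaySeconds
  - rationale: ordinary users have little reason to adjust it, and it directly affects the engine's delay policy
- MaxAgentIterations
- MaxRetryOnError
  - rationale: runtime tuning values that directly affect the quality of the core execution loop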
Keep as runtime-critical:
- OperationMode
- MaxContextTokens
- ContextCompactTriggerPercent
- EnableProactiveContextCompact
- EnableCoworkVerification
- EnableCodeVerification
- Code.EnableWorktreeTools / EnableTeamTools / EnableCronTools

Known UX / Performance Risks

Topic preset hover flicker was caused by duplicate hover systems:
- custom hover label
- default WPF ToolTip
AX fix:
- remove default ToolTip from topic cards and keep a single hover label path
Remaining runtime performance review targets:
- RefreshContextUsageVisual() frequency
- BuildTopicButtons() rebuild frequency
- OnAgentEvent timeline churn during long Cowork/Code runs
- compact queue summary still needs one more pass to fully match claw-code footer minimalism
