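Switch the AX Agent plan approval flow to transcript-first and document the claw-code comparison baseline

- Added a canonical set of 10 prompts to the parity document to fix the Chat/Cowork/Code regression baseline against claw-code
- Documented the tool/skill differences between AX and claw-code to clarify the remaining parity targets
- Changed the plan approval flow so that PlanViewerWindow is not opened immediately; inline approve/revise/cancel buttons are shown in the transcript first
- Reduced PlanViewerWindow to a secondary detail view opened from the bottom plan button
- Verification: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\ (0 warnings / 0 errors)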
2026-04-05 18:53:28 +09:00
parent a3b3522bb7
commit 6c5b0c5be3
5 changed files with 106 additions and 16 deletions
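
The transcript-first approval flow described in the commit message can be sketched roughly as follows. This is a minimal illustration in TypeScript, not the actual AxCopilot code (which is a C#/WPF project and is not shown in this commit view). Every name in the sketch (`PlanApprovalFlow`, `requestDecision`, `decide`, `openViewer`, `TranscriptEntry`) is hypothetical; only the behavior it models comes from the commit: the detail viewer is prepared but not opened, the inline controls appear in the transcript first, and the viewer opens only on demand.

```typescript
// Hypothetical sketch: none of these names exist in AxCopilot or claw-code.
type PlanDecision = "approve" | "revise" | "cancel";

interface TranscriptEntry {
  kind: "plan-approval";
  steps: string[];
  actions: PlanDecision[]; // inline buttons rendered with the entry
}

class PlanApprovalFlow {
  readonly transcript: TranscriptEntry[] = [];
  viewerPrepared = false;
  viewerOpen = false;
  private resolveDecision: ((decision: PlanDecision) => void) | null = null;

  // Prepare the detail viewer without opening it, then surface the
  // inline approval controls in the transcript and wait for a decision.
  requestDecision(steps: string[]): Promise<PlanDecision> {
    this.viewerPrepared = true;
    this.transcript.push({
      kind: "plan-approval",
      steps,
      actions: ["approve", "revise", "cancel"],
    });
    return new Promise<PlanDecision>((resolve) => {
      this.resolveDecision = resolve;
    });
  }

  // Called when the user clicks one of the inline transcript buttons.
  decide(decision: PlanDecision): void {
    this.resolveDecision?.(decision);
    this.resolveDecision = null;
  }

  // Called from the bottom plan button: the viewer is a secondary
  // detail surface and opens only when explicitly requested.
  openViewer(): boolean {
    if (this.viewerPrepared) {
      this.viewerOpen = true;
    }
    return this.viewerOpen;
  }
}
```

In this sketch the decision promise is resolved by the inline controls alone, so the viewer can stay closed for the whole approval round trip.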
--- a/docs/DEVELOPMENT.md
+++ b/docs/DEVELOPMENT.md
@@ -4745,3 +4745,6 @@
 - After this batch, parity was judged final at `core engine 100% / main transcript UI 100% / Cowork·Code runtime UX 100% / internal settings 100% / overall 100%`.
 - Verification: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\` (0 warnings / 0 errors)
 - Updated: 2026-04-05 21:43 (KST)
+- Document update: 2026-04-05 22:04 (KST) - Added a canonical 10-prompt regression set to `docs/claw-code-parity-plan.md` so AX Agent and `claw-code` can be compared on the same Chat/Cowork/Code scenarios: basic/long chat, document/data cowork, bug-fix/build code, queued follow-up, post-compaction continuity, permission approval, and slash skill entry.
+- Document update: 2026-04-05 22:04 (KST) - Added a tool/skill delta snapshot to the parity plan. AX remains stronger on document/office/data workflows, while `claw-code` remains stronger on transcript-native approval/tool-result/permission message taxonomy.
+- Document update: 2026-04-05 22:04 (KST) - Switched the plan approval flow to transcript-first. `CreatePlanDecisionCallback()` now prepares `PlanViewerWindow` without auto-opening it, shows the inline approval controls in the transcript first, and keeps the bottom `계획` (Plan) button as the secondary detail surface.
--- a/docs/claw-code-parity-plan.md
+++ b/docs/claw-code-parity-plan.md
@@ -122,6 +122,83 @@
 - Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
 - Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression

+## Canonical Prompt Set
+- Updated: 2026-04-05 22:04 (KST)
+- The following prompt set should be used for AX vs `claw-code` parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.
+
+1. Chat basic answer
+- Prompt: `회의 일정 조정 메일을 정중한 한국어로 써줘`
+- Apply to: `Chat`
+- Verify: normal reply render, retry/regenerate stability, reopen durability
+
+2. Chat long-form explanation
+- Prompt: `RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘`
+- Apply to: `Chat`
+- Verify: long response rendering, compaction follow-up continuity
+
+3. Cowork document task
+- Prompt: `신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함`
+- Apply to: `Cowork`
+- Verify: topic/task preset routing, plan-first execution, actual document-oriented output path
+
+4. Cowork data task
+- Prompt: `매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘`
+- Apply to: `Cowork`
+- Verify: data-analysis tool choice, reduced runtime noise, final summary quality
+
+5. Code bug-fix task
+- Prompt: `현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘`
+- Apply to: `Code`
+- Verify: read/search/edit path, diff persistence, reopen consistency
+
+6. Code build/test task
+- Prompt: `빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘`
+- Apply to: `Code`
+- Verify: build/test loop, failure retry, final completion message
+
+7. Queued follow-up
+- Prompt sequence:
+  - `이 창 레이아웃 문제 원인 찾아줘`
+  - `끝나면 README도 같이 갱신해줘`
+- Apply to: `Cowork`, `Code`
+- Verify: queue chaining, next-turn pickup without UI mutation
+
+8. Post-compaction continuity
+- Prompt: `지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘`
+- Apply to: `Chat`, `Cowork`, `Code`
+- Verify: compact-after-next-turn continuity, no token-only completion
+
+9. Permission approval
+- Prompt: `이 파일을 수정해서 저장해줘`
+- Apply to: `Code`
+- Verify: permission request, approve/reject rendering, final transcript consistency
+
+10. Slash / skill entry
+- Prompt: `/bug-hunt src 폴더 잠재 버그 찾아줘`
+- Apply to: `Code`
+- Verify: slash entry uses the same prepared-execution route as normal send
+
+## Tool / Skill Delta Snapshot
+- Updated: 2026-04-05 22:04 (KST)
+- AX's tool registry is larger than `claw-code`'s, but the two differ in which areas they cover.
+- AX reference: `src/AxCopilot/Services/Agent/ToolRegistry.cs`
+- `claw-code` reference: `src/tools/*`, `src/skills/bundledSkills.ts`
+
+### AX stronger areas
+- Document/office generation and conversion (`ExcelSkill`, `DocxSkill`, `PptxSkill`, `DocumentPlannerTool`, `DocumentAssemblerTool`)
+- Data/business utilities (`DataPivotTool`, `SqlTool`, `FormatConvertTool`, `TextSummarizeTool`)
+- WPF-integrated enterprise UX and Korean workflow presets
+
+### claw-code stronger areas
+- Transcript-native tool use / rejection / approval message taxonomy
+- Plan approval request/response rendering in the message stream
+- Permission and tool-result message consistency
+- Bundled skill registry and skill message integration
+
+### Remaining parity target
+- Keep AX's richer business/document tool set
+- Bring transcript rendering and approval/status UX closer to `claw-code`
+
 ## Current Snapshot
 - Updated: 2026-04-05 19:42 (KST)
 - Estimated parity: