AX Agent 계획 승인 흐름을 transcript 우선으로 전환하고 claw-code 비교 기준을 문서화
- claw-code 대비 canonical prompt set 10종을 parity 문서에 추가해 Chat/Cowork/Code 회귀 검증 기준을 고정함 - AX와 claw-code의 도구/스킬 차이를 문서에 정리해 남은 parity 목표를 명확히 함 - PlanViewerWindow를 즉시 띄우지 않고 inline 승인/수정/취소 버튼을 transcript에 먼저 노출하도록 계획 승인 흐름을 변경함 - PlanViewerWindow는 하단 계획 버튼으로 여는 보조 상세 보기 역할로 축소함 - 검증: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\ (경고 0 / 오류 0)
This commit is contained in:
@@ -4745,3 +4745,6 @@ ow + toggle ?쒓컖 ?몄뼱濡??ㅼ떆 ?뺣젹?덈떎.
|
||||
- 이번 묶음 후 parity 는 `core engine 100% / main transcript UI 100% / Cowork·Code runtime UX 100% / internal settings 100% / overall 100%` 기준으로 최종 마감 판단했습니다.
|
||||
- 검증: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\` 경고 0 / 오류 0
|
||||
- 업데이트: 2026-04-05 21:43 (KST)
|
||||
- Document update: 2026-04-05 22:04 (KST) - Added a canonical 10-prompt regression set to `docs/claw-code-parity-plan.md` so AX Agent and `claw-code` can be compared on the same Chat/Cowork/Code scenarios: basic/long chat, document/data cowork, bug-fix/build code, queued follow-up, post-compaction continuity, permission approval, and slash skill entry.
|
||||
- Document update: 2026-04-05 22:04 (KST) - Added a tool/skill delta snapshot to the parity plan. AX remains stronger on document/office/data workflows, while `claw-code` remains stronger on transcript-native approval/tool-result/permission message taxonomy.
|
||||
- Document update: 2026-04-05 22:04 (KST) - Switched plan approval flow to transcript-first. `CreatePlanDecisionCallback()` now prepares `PlanViewerWindow` without auto-opening it, shows the inline approval controls in the transcript first, and keeps the bottom `계획` button as the secondary detail surface.
|
||||
|
||||
@@ -122,6 +122,83 @@
|
||||
- Manual scenario 3: Code task with execution log noise -> completion -> compact -> next turn -> reopen
|
||||
- Manual scenario 4: AX Agent internal settings change -> immediate runtime reflection without layout regression
|
||||
|
||||
## Canonical Prompt Set
|
||||
- Updated: 2026-04-05 22:04 (KST)
|
||||
- The following prompt set should be used for AX vs `claw-code` parity checks. The goal is not byte-identical output, but equivalent execution route, approval behavior, and artifact/result quality.
|
||||
|
||||
1. Chat basic answer
|
||||
- Prompt: `회의 일정 조정 메일을 정중한 한국어로 써줘`
|
||||
- Apply to: `Chat`
|
||||
- Verify: normal reply render, retry/regenerate stability, reopen durability
|
||||
|
||||
2. Chat long-form explanation
|
||||
- Prompt: `RAG와 fine-tuning 차이를 실무 관점으로 7가지로 설명해줘`
|
||||
- Apply to: `Chat`
|
||||
- Verify: long response rendering, compaction follow-up continuity
|
||||
|
||||
3. Cowork document task
|
||||
- Prompt: `신규 ERP 도입 제안서 초안을 작성해줘. 목적, 범위, 기대효과, 추진일정 포함`
|
||||
- Apply to: `Cowork`
|
||||
- Verify: topic/task preset routing, plan-first execution, actual document-oriented output path
|
||||
|
||||
4. Cowork data task
|
||||
- Prompt: `매출 CSV를 분석해서 월별 추세와 이상치를 요약해줘`
|
||||
- Apply to: `Cowork`
|
||||
- Verify: data-analysis tool choice, reduced runtime noise, final summary quality
|
||||
|
||||
5. Code bug-fix task
|
||||
- Prompt: `현재 프로젝트에서 설정 저장 버그 원인 찾고 수정해줘`
|
||||
- Apply to: `Code`
|
||||
- Verify: read/search/edit path, diff persistence, reopen consistency
|
||||
|
||||
6. Code build/test task
|
||||
- Prompt: `빌드 오류를 재현하고 수정한 뒤 다시 빌드해줘`
|
||||
- Apply to: `Code`
|
||||
- Verify: build/test loop, failure retry, final completion message
|
||||
|
||||
7. Queued follow-up
|
||||
- Prompt sequence:
|
||||
- `이 창 레이아웃 문제 원인 찾아줘`
|
||||
- `끝나면 README도 같이 갱신해줘`
|
||||
- Apply to: `Cowork`, `Code`
|
||||
- Verify: queue chaining, next-turn pickup without UI mutation
|
||||
|
||||
8. Post-compaction continuity
|
||||
- Prompt: `지금까지 논의한 내용을 5줄로 이어서 정리하고 다음 작업 제안해줘`
|
||||
- Apply to: `Chat`, `Cowork`, `Code`
|
||||
- Verify: compact-after-next-turn continuity, no token-only completion
|
||||
|
||||
9. Permission approval
|
||||
- Prompt: `이 파일을 수정해서 저장해줘`
|
||||
- Apply to: `Code`
|
||||
- Verify: permission request, approve/reject rendering, final transcript consistency
|
||||
|
||||
10. Slash / skill entry
|
||||
- Prompt: `/bug-hunt src 폴더 잠재 버그 찾아줘`
|
||||
- Apply to: `Code`
|
||||
- Verify: slash entry uses the same prepared-execution route as normal send
|
||||
|
||||
## Tool / Skill Delta Snapshot
|
||||
- Updated: 2026-04-05 22:04 (KST)
|
||||
- AX tool registry count is larger than `claw-code`, but the shape is different.
|
||||
- AX reference: `src/AxCopilot/Services/Agent/ToolRegistry.cs`
|
||||
- `claw-code` reference: `src/tools/*`, `src/skills/bundledSkills.ts`
|
||||
|
||||
### AX stronger areas
|
||||
- Document/office generation and conversion (`ExcelSkill`, `DocxSkill`, `PptxSkill`, `DocumentPlannerTool`, `DocumentAssemblerTool`)
|
||||
- Data/business utilities (`DataPivotTool`, `SqlTool`, `FormatConvertTool`, `TextSummarizeTool`)
|
||||
- WPF-integrated enterprise UX and Korean workflow presets
|
||||
|
||||
### claw-code stronger areas
|
||||
- Transcript-native tool use / rejection / approval message taxonomy
|
||||
- Plan approval request/response rendering in the message stream
|
||||
- Permission and tool-result message consistency
|
||||
- Bundled skill registry and skill message integration
|
||||
|
||||
### Remaining parity target
|
||||
- Keep AX's richer business/document tool set
|
||||
- Bring transcript rendering and approval/status UX closer to `claw-code`
|
||||
|
||||
## Current Snapshot
|
||||
- Updated: 2026-04-05 19:42 (KST)
|
||||
- Estimated parity:
|
||||
|
||||
Reference in New Issue
Block a user