Lighten the Cowork document-generation gates and the code follow-up verification gates, using claw-code as the baseline

- Reduced document_plan retries and aggressive document-fallback intervention in the balanced and tool_call_strict profiles, making the Cowork loop thinner.
- Removed the forced user follow-up injected right after document_plan succeeds, and adjusted Cowork so it can finish immediately once a terminal document tool succeeds.
- Restricted CodeDiffGate, RecentExecutionGate, and ExecutionSuccessGate mainly to review tasks, easing over-verification of ordinary code edits.
- Aligned TaskTypePolicy, SystemPromptBuilder, and the cowork preset so that termination conditions for document-generation and analysis-type requests are consistent.
- Updated README.md and docs/DEVELOPMENT.md as of 2026-04-12 23:05 (KST).
- Verification: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify\\ -p:IntermediateOutputPath=obj\\verify\\ (0 warnings, 0 errors)
2026-04-12 22:15:26 +09:00
parent fb0bea41f7
commit 4db75d46cd
10 changed files with 86 additions and 54 deletions
--- a/src/AxCopilot/Services/Agent/AgentLoopService.cs
+++ b/src/AxCopilot/Services/Agent/AgentLoopService.cs
@@ -600,7 +600,10 @@ public partial class AgentLoopService
                    }

                    // ── 2단계: document_plan 직접 생성 폴백 ──
-                    if (documentPlanCalled && !string.IsNullOrEmpty(documentPlanScaffold) && !_docFallbackAttempted)
+                    if (executionPolicy.PreferAggressiveDocumentFallback
+                        && documentPlanCalled
+                        && !string.IsNullOrEmpty(documentPlanScaffold)
+                        && !_docFallbackAttempted)
                    {
                        _docFallbackAttempted = true;
                        EmitEvent(AgentEventType.Thinking, "", "앱에서 직접 문서를 생성합니다...");
@@ -870,6 +873,7 @@ public partial class AgentLoopService

                    // document_plan은 호출됐지만 terminal 문서 도구(html_create 등)가 미호출인 경우 → 프로파일 기준 재시도
                    if (requiresConcreteArtifactOrEdit
+                        && executionPolicy.DocumentPlanRetryMax > 0
                        && documentPlanCalled && postDocumentPlanRetry < documentPlanRetryMax)
                    {
                        postDocumentPlanRetry++;
@@ -886,6 +890,7 @@ public partial class AgentLoopService

                    // 재시도도 모두 소진 → 앱이 직접 본문 생성 후 html_create 강제 실행
                    if (requiresConcreteArtifactOrEdit
+                        && executionPolicy.PreferAggressiveDocumentFallback
                        && documentPlanCalled && !string.IsNullOrEmpty(documentPlanScaffold) && !_docFallbackAttempted)
                    {
                        _docFallbackAttempted = true;
@@ -949,7 +954,8 @@ public partial class AgentLoopService

                    // LLM이 도구를 한 번도 호출하지 않고 텍스트만 반환 + 문서 생성 요청이면 → 앱이 직접 HTML 파일로 저장
                    // 주의: 이미 도구가 실행된 경우(totalToolCalls > 0)에는 폴백하지 않음 (중복 파일 방지)
-                    if (!_docFallbackAttempted && totalToolCalls == 0
+                    if (executionPolicy.PreferAggressiveDocumentFallback
+                        && !_docFallbackAttempted && totalToolCalls == 0
                        && !string.IsNullOrEmpty(textResponse)
                        && IsDocumentCreationRequest(userQuery))
                    {
@@ -986,6 +992,7 @@ public partial class AgentLoopService
                    if (TryApplyCodeDiffEvidenceGateTransition(
                            messages,
                            textResponse,
+                            taskPolicy,
                            runState,
                            executionPolicy))
                        continue;
--- a/src/AxCopilot/Services/Agent/AgentLoopTransitions.Documents.cs
+++ b/src/AxCopilot/Services/Agent/AgentLoopTransitions.Documents.cs
@@ -28,16 +28,7 @@ public partial class AgentLoopService
            return;

        var toolHint = ResolveDocumentPlanFollowUpTool(po);
-        messages.Add(new ChatMessage
-        {
-            Role = "user",
-            Content =
-                "document_plan이 완료되었습니다. " +
-                "방금 생성된 골격의 [내용...] 자리와 각 섹션 내용을 실제 상세 본문으로 모두 채운 뒤 " +
-                $"{toolHint} 도구를 지금 즉시 호출하세요. " +
-                "설명만 하지 말고 실제 문서 생성 도구 호출로 바로 이어가세요."
-        });
-        EmitEvent(AgentEventType.Thinking, "", $"문서 개요 완료 · {toolHint} 실행 유도");
+        EmitEvent(AgentEventType.Thinking, "", $"Document structure ready · next creation candidate {toolHint}");
    }

    private static string? ExtractDocumentPlanScaffold(string output)
@@ -47,7 +38,6 @@ public partial class AgentLoopService

        var markers = new (string Start, string End)[]
        {
-            ("--- body 시작 ---", "--- body 끝 ---"),
            ("--- body start ---", "--- body end ---"),
            ("<!-- body start marker -->", "<!-- body end marker -->"),
        };
@@ -76,10 +66,10 @@ public partial class AgentLoopService
        if (string.IsNullOrWhiteSpace(output))
            return false;

-        return output.Contains("즉시 실행", StringComparison.OrdinalIgnoreCase)
-               || output.Contains("immediate next step", StringComparison.OrdinalIgnoreCase)
+        return output.Contains("immediate next step", StringComparison.OrdinalIgnoreCase)
               || output.Contains("call html_create", StringComparison.OrdinalIgnoreCase)
-               || output.Contains("call document_assemble", StringComparison.OrdinalIgnoreCase);
+               || output.Contains("call document_assemble", StringComparison.OrdinalIgnoreCase)
+               || output.Contains("call docx_create", StringComparison.OrdinalIgnoreCase);
    }

    private static string ResolveDocumentPlanFollowUpTool(string output)
@@ -109,11 +99,6 @@ public partial class AgentLoopService
        if (!result.Success || !IsTerminalDocumentTool(call.ToolName) || toolCalls.Count != 1)
            return (false, false);

-        // document_plan 없이 바로 문서 도구가 호출된 경우 — 아직 LLM이 추가 반복을 할 수 있음.
-        // 한 번에 생성된 문서는 내용이 부실할 수 있으므로 조기 종료하지 않고 LLM에 판단을 맡긴다.
-        if (!_docFallbackAttempted && !documentPlanWasCalled)
-            return (false, false);
-
        if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
        {
            EmitEvent(AgentEventType.Complete, "", "에이전트 작업 완료");
@@ -165,15 +150,12 @@ public partial class AgentLoopService
        if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
            return false;

-        if (string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
-        {
-            var highImpactCodeChange = IsHighImpactCodeModification(ActiveTab ?? "", call.ToolName, result);
-            var hasDiffEvidence = HasDiffEvidenceAfterLastModification(messages);
-            var hasRecentBuildOrTestEvidence = HasBuildOrTestEvidenceAfterLastModification(messages);
+        var highImpactCodeChange = IsHighImpactCodeModification(ActiveTab ?? "", call.ToolName, result);
+        var hasDiffEvidence = HasDiffEvidenceAfterLastModification(messages);
+        var hasRecentBuildOrTestEvidence = HasBuildOrTestEvidenceAfterLastModification(messages);

-            if (!highImpactCodeChange || (hasDiffEvidence && hasRecentBuildOrTestEvidence))
-                return false;
-        }
+        if (!highImpactCodeChange || (hasDiffEvidence && hasRecentBuildOrTestEvidence))
+            return false;

        await RunPostToolVerificationAsync(messages, call.ToolName, result, context, ct);
        return true;
--- a/src/AxCopilot/Services/Agent/AgentLoopTransitions.Verification.cs
+++ b/src/AxCopilot/Services/Agent/AgentLoopTransitions.Verification.cs
@@ -123,12 +123,16 @@ public partial class AgentLoopService
    private bool TryApplyCodeDiffEvidenceGateTransition(
        List<ChatMessage> messages,
        string? textResponse,
+        TaskTypePolicy taskPolicy,
        RunState runState,
        ModelExecutionProfileCatalog.ExecutionPolicy executionPolicy)
    {
        if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
            return false;

+        if (!taskPolicy.IsReviewTask)
+            return false;
+
        if (executionPolicy.CodeDiffGateMaxRetries <= 0 || runState.CodeDiffGateRetry >= executionPolicy.CodeDiffGateMaxRetries)
            return false;

@@ -157,6 +161,9 @@ public partial class AgentLoopService
        if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
            return false;

+        if (!taskPolicy.IsReviewTask)
+            return false;
+
        if (executionPolicy.RecentExecutionGateMaxRetries <= 0 || runState.RecentExecutionGateRetry >= executionPolicy.RecentExecutionGateMaxRetries)
            return false;

@@ -191,6 +198,9 @@ public partial class AgentLoopService
        if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
            return false;

+        if (!taskPolicy.IsReviewTask)
+            return false;
+
        if (executionPolicy.ExecutionSuccessGateMaxRetries <= 0 || runState.ExecutionSuccessGateRetry >= executionPolicy.ExecutionSuccessGateMaxRetries)
            return false;

--- a/src/AxCopilot/Services/Agent/ModelExecutionProfileCatalog.cs
+++ b/src/AxCopilot/Services/Agent/ModelExecutionProfileCatalog.cs
@@ -58,8 +58,8 @@ public static class ModelExecutionProfileCatalog
                NoToolResponseThreshold: 1,
                NoToolRecoveryMaxRetries: 4,  // IBM/Qwen 등 chatty 모델: 재시도 횟수 늘려 도구 호출 강제
                PlanExecutionRetryMax: 2,
-                DocumentPlanRetryMax: 2,
-                PreferAggressiveDocumentFallback: true,
+                DocumentPlanRetryMax: 0,
+                PreferAggressiveDocumentFallback: false,
                ReduceEarlyMemoryPressure: true,
                EnablePostToolVerification: false,
                EnableCodeQualityGates: true,
@@ -154,12 +154,12 @@ public static class ModelExecutionProfileCatalog
                "balanced",
                "균형",
                ForceInitialToolCall: true,
-                ForceToolCallAfterPlan: true,
+                ForceToolCallAfterPlan: false,
                ToolTemperatureCap: 0.35,
                NoToolResponseThreshold: 2,
                NoToolRecoveryMaxRetries: 2,
                PlanExecutionRetryMax: 2,
-                DocumentPlanRetryMax: 2,
+                DocumentPlanRetryMax: 0,
                PreferAggressiveDocumentFallback: false,
                ReduceEarlyMemoryPressure: false,
                EnablePostToolVerification: false,
--- a/src/AxCopilot/Services/Agent/TaskTypePolicy.cs
+++ b/src/AxCopilot/Services/Agent/TaskTypePolicy.cs
@@ -23,10 +23,10 @@ internal sealed class TaskTypePolicy
                GuidanceMessage =
                    "[System:TaskType] This is a bug-fix task. Prioritize reproduction evidence, root cause linkage, smallest safe fix, and regression verification. " +
                    "Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit -> build_run/test_loop as needed -> git_tool(diff) when it helps confirm the final change.",
-                FailurePatternFocus = "재현 조건과 원인 연결을 먼저 확인하세요. Check reproduction conditions and root-cause linkage first.",
-                FollowUpTaskLine = "작업 유형: bugfix. Task type: bugfix. Verify the fix is directly linked to the symptom and confirm non-regression.\n",
-                FailureInvestigationTaskLine = "추가 점검: 재현 조건 기준으로 증상이 재현되지 않는지와 원인 연결이 타당한지 확인하세요. Extra check: confirm the symptom is no longer reproducible and root-cause linkage is valid.\n",
-                FinalReportTaskLine = "버그 수정은 원인, 수정 내용, 재현/회귀 검증 근거를 포함하세요. For bug fixes, include root cause, change summary, and reproduction/regression evidence.\n",
+                FailurePatternFocus = "Check reproduction conditions and root-cause linkage first.",
+                FollowUpTaskLine = "Task type: bugfix. Verify the fix is directly linked to the symptom and confirm non-regression.\n",
+                FailureInvestigationTaskLine = "Extra check: confirm the symptom is no longer reproducible and root-cause linkage is valid.\n",
+                FinalReportTaskLine = "For bug fixes, include root cause, change summary, and reproduction/regression evidence.\n",
            },
            "feature" => new TaskTypePolicy
            {
@@ -36,8 +36,8 @@ internal sealed class TaskTypePolicy
                    "Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit/file_write -> build_run/test_loop as needed -> git_tool(diff). " +
                    "Use folder_map only when the user explicitly needs folder structure or file listing.",
                FailurePatternFocus = "Check new behavior flow and caller linkage first.",
-                FollowUpTaskLine = "작업 유형: feature. Task type: feature. Verify behavior flow, input/output path, caller impact, and test additions.\n",
-                FailureInvestigationTaskLine = "추가 점검: 새 기능 경로와 호출부 연결이 의도대로 동작하는지 확인하세요. Extra check: confirm feature path and caller linkage behave as intended.\n",
+                FollowUpTaskLine = "Task type: feature. Verify behavior flow, input/output path, caller impact, and test additions.\n",
+                FailureInvestigationTaskLine = "Extra check: confirm feature path and caller linkage behave as intended.\n",
                FinalReportTaskLine = "For features, include behavior flow, impacted files/callers, and verification evidence.\n",
            },
            "refactor" => new TaskTypePolicy
@@ -47,8 +47,8 @@ internal sealed class TaskTypePolicy
                    "[System:TaskType] This is a refactor task. Prioritize behavior preservation, reference impact, diff review, and non-regression evidence. " +
                    "Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit -> build_run/test_loop as needed -> git_tool(diff).",
                FailurePatternFocus = "Check behavior preservation and impact scope first.",
-                FollowUpTaskLine = "작업 유형: refactor. Task type: refactor. Prioritize behavior-preservation evidence over cosmetic cleanup.\n",
-                FailureInvestigationTaskLine = "추가 점검: 동작 보존 관점에서 기존 호출 흐름이 동일하게 유지되는지 확인하세요. Extra check: validate existing call flow remains behavior-compatible.\n",
+                FollowUpTaskLine = "Task type: refactor. Prioritize behavior-preservation evidence over cosmetic cleanup.\n",
+                FailureInvestigationTaskLine = "Extra check: validate existing call flow remains behavior-compatible.\n",
                FinalReportTaskLine = "For refactors, include behavior-preservation evidence and impact scope.\n",
            },
            "review" => new TaskTypePolicy
@@ -59,8 +59,8 @@ internal sealed class TaskTypePolicy
                    "Report findings with P0-P3 severity and file evidence, then separate Fixed vs Unfixed status. " +
                    "Preferred tool order: targeted file_read or grep/glob/lsp -> git_tool(diff) when available -> evidence-first findings.",
                FailurePatternFocus = "Review focus: severity accuracy (P0-P3), file-grounded evidence, and unresolved-risk clarity.",
-                FollowUpTaskLine = "작업 유형: review-follow-up. Task type: review-follow-up. For each finding, state status as Fixed or Unfixed with verification evidence.\n",
-                FailureInvestigationTaskLine = "추가 점검: 리뷰에서 지적된 위험은 반드시 수정 근거나 미해결 사유/영향을 남기세요. Extra check: every risk must have either a concrete fix or an explicit unresolved rationale and impact.\n",
+                FollowUpTaskLine = "Task type: review-follow-up. For each finding, state status as Fixed or Unfixed with verification evidence.\n",
+                FailureInvestigationTaskLine = "Extra check: every risk must have either a concrete fix or an explicit unresolved rationale and impact.\n",
                FinalReportTaskLine = "For review, list P0-P3 findings with file evidence and split into Fixed vs Unfixed with residual risk.\n",
                IsReviewTask = true
            },
@@ -69,10 +69,10 @@ internal sealed class TaskTypePolicy
                TaskType = "docs",
                GuidanceMessage =
                    "[System:TaskType] This is a document/content task. " +
-                    "If the user asks you to CREATE/WRITE a new document, skip file exploration entirely — " +
-                    "go directly to the creation tool (docx_create, html_create, excel_create, etc.) and use document_plan only when it materially helps structure a multi-section document. " +
-                    "to produce a REAL FILE on disk. Do NOT respond with text only — the output MUST be a file. " +
-                    "If the user asks you to READ/ANALYZE existing documents, use: glob/grep -> document_read/file_read -> analysis. " +
+                    "If the user asks you to create or write a new document, skip file exploration entirely and go directly to the creation tool (docx_create, html_create, excel_create, etc.). " +
+                    "Use document_plan only when it materially helps structure a multi-section document. " +
+                    "Prefer producing a real file on disk when the request is clearly for a deliverable, but do not force file creation for analysis-only or advisory requests. " +
+                    "If the user asks you to read or analyze existing documents, use: glob/grep -> document_read/file_read -> analysis. " +
                    "Use folder_map only when the user explicitly asks for folder contents or directory structure.",
                FailurePatternFocus = "Check source evidence and document completeness first.",
                FollowUpTaskLine = "",