Slim down the Cowork document-generation gates and the code follow-up verification gates to align with claw-code

- Reduce document_plan retries and aggressive document-fallback intervention in the balanced/tool_call_strict profiles to keep the Cowork loop thinner

- Remove the forced user follow-up injection right after document_plan succeeds, and let Cowork finish immediately when a terminal document tool succeeds

- Restrict CodeDiffGate, RecentExecutionGate, and ExecutionSuccessGate to review tasks to ease over-verification of ordinary code edits

- Align TaskTypePolicy, SystemPromptBuilder, and the cowork preset so the termination conditions for document-generation and analysis-type requests are consistent

- Update README.md and docs/DEVELOPMENT.md as of 2026-04-12 23:05 (KST)

- Verification: dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify\\ -p:IntermediateOutputPath=obj\\verify\\ (0 warnings, 0 errors)
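
The gating described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual AxCopilot code: the `ExecutionPolicy` and `TaskTypePolicy` records are simplified stand-ins that model only the fields named in this commit, and `GateSketch` with its three helper methods is a hypothetical name introduced here for illustration.

```csharp
using System;

// Simplified stand-ins (assumption): only the fields touched by this commit are modeled.
public sealed record ExecutionPolicy(
    int DocumentPlanRetryMax,
    bool PreferAggressiveDocumentFallback,
    int CodeDiffGateMaxRetries);

public sealed record TaskTypePolicy(string TaskType, bool IsReviewTask);

// Hypothetical helper class; the real checks live inline in AgentLoopService.
public static class GateSketch
{
    // Retry after document_plan only when the profile allows at least one retry.
    public static bool ShouldRetryDocumentPlan(
        ExecutionPolicy policy, bool documentPlanCalled, int retriesSoFar) =>
        policy.DocumentPlanRetryMax > 0
        && documentPlanCalled
        && retriesSoFar < policy.DocumentPlanRetryMax;

    // App-side document generation runs only when the profile opts in.
    public static bool ShouldRunAggressiveFallback(
        ExecutionPolicy policy, bool documentPlanCalled, string? scaffold, bool alreadyAttempted) =>
        policy.PreferAggressiveDocumentFallback
        && documentPlanCalled
        && !string.IsNullOrEmpty(scaffold)
        && !alreadyAttempted;

    // The code evidence gate applies only on the Code tab and only to review tasks.
    public static bool ShouldApplyCodeDiffGate(
        string? activeTab, TaskTypePolicy task, ExecutionPolicy policy, int retriesSoFar) =>
        string.Equals(activeTab, "Code", StringComparison.OrdinalIgnoreCase)
        && task.IsReviewTask
        && policy.CodeDiffGateMaxRetries > 0
        && retriesSoFar < policy.CodeDiffGateMaxRetries;
}
```

With the values this commit sets for the balanced profile (`DocumentPlanRetryMax: 0`, `PreferAggressiveDocumentFallback: false`), the first two helpers return false whatever the other inputs are. The third returns false for any task whose `IsReviewTask` is false, which is the intended relaxation for ordinary code edits.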
2026-04-12 22:15:26 +09:00
parent fb0bea41f7
commit 4db75d46cd
10 changed files with 86 additions and 54 deletions

@@ -600,7 +600,10 @@ public partial class AgentLoopService
}
// ── Stage 2: document_plan direct-generation fallback ──
-if (documentPlanCalled && !string.IsNullOrEmpty(documentPlanScaffold) && !_docFallbackAttempted)
+if (executionPolicy.PreferAggressiveDocumentFallback
+&& documentPlanCalled
+&& !string.IsNullOrEmpty(documentPlanScaffold)
+&& !_docFallbackAttempted)
{
_docFallbackAttempted = true;
EmitEvent(AgentEventType.Thinking, "", "앱에서 직접 문서를 생성합니다...");
@@ -870,6 +873,7 @@ public partial class AgentLoopService
// document_plan was called but no terminal document tool (html_create, etc.) was called → retry according to the profile
if (requiresConcreteArtifactOrEdit
+&& executionPolicy.DocumentPlanRetryMax > 0
&& documentPlanCalled && postDocumentPlanRetry < documentPlanRetryMax)
{
postDocumentPlanRetry++;
@@ -886,6 +890,7 @@ public partial class AgentLoopService
// Retries are also exhausted → the app generates the body itself and forces html_create
if (requiresConcreteArtifactOrEdit
+&& executionPolicy.PreferAggressiveDocumentFallback
&& documentPlanCalled && !string.IsNullOrEmpty(documentPlanScaffold) && !_docFallbackAttempted)
{
_docFallbackAttempted = true;
@@ -949,7 +954,8 @@ public partial class AgentLoopService
// The LLM returned text only without calling any tool, and this is a document-creation request → the app saves it directly as an HTML file
// Note: no fallback when a tool has already run (totalToolCalls > 0), to avoid duplicate files
-if (!_docFallbackAttempted && totalToolCalls == 0
+if (executionPolicy.PreferAggressiveDocumentFallback
+&& !_docFallbackAttempted && totalToolCalls == 0
&& !string.IsNullOrEmpty(textResponse)
&& IsDocumentCreationRequest(userQuery))
{
@@ -986,6 +992,7 @@ public partial class AgentLoopService
if (TryApplyCodeDiffEvidenceGateTransition(
messages,
textResponse,
+taskPolicy,
runState,
executionPolicy))
continue;

@@ -28,16 +28,7 @@ public partial class AgentLoopService
return;
var toolHint = ResolveDocumentPlanFollowUpTool(po);
-messages.Add(new ChatMessage
-{
-Role = "user",
-Content =
-"document_plan이 완료되었습니다. " +
-"방금 생성된 골격의 [내용...] 자리와 각 섹션 내용을 실제 상세 본문으로 모두 채운 뒤 " +
-$"{toolHint} 도구를 지금 즉시 호출하세요. " +
-"설명만 하지 말고 실제 문서 생성 도구 호출로 바로 이어가세요."
-});
-EmitEvent(AgentEventType.Thinking, "", $"문서 개요 완료 · {toolHint} 실행 유도");
+EmitEvent(AgentEventType.Thinking, "", $"Document structure ready · next creation candidate {toolHint}");
}
private static string? ExtractDocumentPlanScaffold(string output)
@@ -47,7 +38,6 @@ public partial class AgentLoopService
var markers = new (string Start, string End)[]
{
("--- body 시작 ---", "--- body 끝 ---"),
("--- body start ---", "--- body end ---"),
("<!-- body start marker -->", "<!-- body end marker -->"),
};
@@ -76,10 +66,10 @@ public partial class AgentLoopService
if (string.IsNullOrWhiteSpace(output))
return false;
-return output.Contains("즉시 실행", StringComparison.OrdinalIgnoreCase)
-|| output.Contains("immediate next step", StringComparison.OrdinalIgnoreCase)
+return output.Contains("immediate next step", StringComparison.OrdinalIgnoreCase)
|| output.Contains("call html_create", StringComparison.OrdinalIgnoreCase)
-|| output.Contains("call document_assemble", StringComparison.OrdinalIgnoreCase);
+|| output.Contains("call document_assemble", StringComparison.OrdinalIgnoreCase)
+|| output.Contains("call docx_create", StringComparison.OrdinalIgnoreCase);
}
private static string ResolveDocumentPlanFollowUpTool(string output)
@@ -109,11 +99,6 @@ public partial class AgentLoopService
if (!result.Success || !IsTerminalDocumentTool(call.ToolName) || toolCalls.Count != 1)
return (false, false);
-// A document tool was called directly without document_plan; the LLM may still run further iterations.
-// A document generated in one shot may be thin, so do not exit early and leave the decision to the LLM.
-if (!_docFallbackAttempted && !documentPlanWasCalled)
-return (false, false);
if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
{
EmitEvent(AgentEventType.Complete, "", "에이전트 작업 완료");
@@ -165,15 +150,12 @@ public partial class AgentLoopService
if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
return false;
-if (string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
-{
-var highImpactCodeChange = IsHighImpactCodeModification(ActiveTab ?? "", call.ToolName, result);
-var hasDiffEvidence = HasDiffEvidenceAfterLastModification(messages);
-var hasRecentBuildOrTestEvidence = HasBuildOrTestEvidenceAfterLastModification(messages);
+var highImpactCodeChange = IsHighImpactCodeModification(ActiveTab ?? "", call.ToolName, result);
+var hasDiffEvidence = HasDiffEvidenceAfterLastModification(messages);
+var hasRecentBuildOrTestEvidence = HasBuildOrTestEvidenceAfterLastModification(messages);
-if (!highImpactCodeChange || (hasDiffEvidence && hasRecentBuildOrTestEvidence))
-return false;
-}
+if (!highImpactCodeChange || (hasDiffEvidence && hasRecentBuildOrTestEvidence))
+return false;
await RunPostToolVerificationAsync(messages, call.ToolName, result, context, ct);
return true;

@@ -123,12 +123,16 @@ public partial class AgentLoopService
private bool TryApplyCodeDiffEvidenceGateTransition(
List<ChatMessage> messages,
string? textResponse,
+TaskTypePolicy taskPolicy,
RunState runState,
ModelExecutionProfileCatalog.ExecutionPolicy executionPolicy)
{
if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
return false;
+if (!taskPolicy.IsReviewTask)
+return false;
if (executionPolicy.CodeDiffGateMaxRetries <= 0 || runState.CodeDiffGateRetry >= executionPolicy.CodeDiffGateMaxRetries)
return false;
@@ -157,6 +161,9 @@ public partial class AgentLoopService
if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
return false;
+if (!taskPolicy.IsReviewTask)
+return false;
if (executionPolicy.RecentExecutionGateMaxRetries <= 0 || runState.RecentExecutionGateRetry >= executionPolicy.RecentExecutionGateMaxRetries)
return false;
@@ -191,6 +198,9 @@ public partial class AgentLoopService
if (!string.Equals(ActiveTab, "Code", StringComparison.OrdinalIgnoreCase))
return false;
+if (!taskPolicy.IsReviewTask)
+return false;
if (executionPolicy.ExecutionSuccessGateMaxRetries <= 0 || runState.ExecutionSuccessGateRetry >= executionPolicy.ExecutionSuccessGateMaxRetries)
return false;

@@ -58,8 +58,8 @@ public static class ModelExecutionProfileCatalog
NoToolResponseThreshold: 1,
NoToolRecoveryMaxRetries: 4, // chatty models such as IBM/Qwen: raise the retry count to force tool calls
PlanExecutionRetryMax: 2,
-DocumentPlanRetryMax: 2,
-PreferAggressiveDocumentFallback: true,
+DocumentPlanRetryMax: 0,
+PreferAggressiveDocumentFallback: false,
ReduceEarlyMemoryPressure: true,
EnablePostToolVerification: false,
EnableCodeQualityGates: true,
@@ -154,12 +154,12 @@ public static class ModelExecutionProfileCatalog
"balanced",
"균형",
ForceInitialToolCall: true,
-ForceToolCallAfterPlan: true,
+ForceToolCallAfterPlan: false,
ToolTemperatureCap: 0.35,
NoToolResponseThreshold: 2,
NoToolRecoveryMaxRetries: 2,
PlanExecutionRetryMax: 2,
-DocumentPlanRetryMax: 2,
+DocumentPlanRetryMax: 0,
PreferAggressiveDocumentFallback: false,
ReduceEarlyMemoryPressure: false,
EnablePostToolVerification: false,

@@ -23,10 +23,10 @@ internal sealed class TaskTypePolicy
GuidanceMessage =
"[System:TaskType] This is a bug-fix task. Prioritize reproduction evidence, root cause linkage, smallest safe fix, and regression verification. " +
"Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit -> build_run/test_loop as needed -> git_tool(diff) when it helps confirm the final change.",
-FailurePatternFocus = "재현 조건과 원인 연결을 먼저 확인하세요. Check reproduction conditions and root-cause linkage first.",
-FollowUpTaskLine = "작업 유형: bugfix. Task type: bugfix. Verify the fix is directly linked to the symptom and confirm non-regression.\n",
-FailureInvestigationTaskLine = "추가 점검: 재현 조건 기준으로 증상이 재현되지 않는지와 원인 연결이 타당한지 확인하세요. Extra check: confirm the symptom is no longer reproducible and root-cause linkage is valid.\n",
-FinalReportTaskLine = "버그 수정은 원인, 수정 내용, 재현/회귀 검증 근거를 포함하세요. For bug fixes, include root cause, change summary, and reproduction/regression evidence.\n",
+FailurePatternFocus = "Check reproduction conditions and root-cause linkage first.",
+FollowUpTaskLine = "Task type: bugfix. Verify the fix is directly linked to the symptom and confirm non-regression.\n",
+FailureInvestigationTaskLine = "Extra check: confirm the symptom is no longer reproducible and root-cause linkage is valid.\n",
+FinalReportTaskLine = "For bug fixes, include root cause, change summary, and reproduction/regression evidence.\n",
},
"feature" => new TaskTypePolicy
{
@@ -36,8 +36,8 @@ internal sealed class TaskTypePolicy
"Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit/file_write -> build_run/test_loop as needed -> git_tool(diff). " +
"Use folder_map only when the user explicitly needs folder structure or file listing.",
FailurePatternFocus = "Check new behavior flow and caller linkage first.",
-FollowUpTaskLine = "작업 유형: feature. Task type: feature. Verify behavior flow, input/output path, caller impact, and test additions.\n",
-FailureInvestigationTaskLine = "추가 점검: 새 기능 경로와 호출부 연결이 의도대로 동작하는지 확인하세요. Extra check: confirm feature path and caller linkage behave as intended.\n",
+FollowUpTaskLine = "Task type: feature. Verify behavior flow, input/output path, caller impact, and test additions.\n",
+FailureInvestigationTaskLine = "Extra check: confirm feature path and caller linkage behave as intended.\n",
FinalReportTaskLine = "For features, include behavior flow, impacted files/callers, and verification evidence.\n",
},
"refactor" => new TaskTypePolicy
@@ -47,8 +47,8 @@ internal sealed class TaskTypePolicy
"[System:TaskType] This is a refactor task. Prioritize behavior preservation, reference impact, diff review, and non-regression evidence. " +
"Preferred tool order: targeted file_read or grep/glob/lsp -> file_edit -> build_run/test_loop as needed -> git_tool(diff).",
FailurePatternFocus = "Check behavior preservation and impact scope first.",
-FollowUpTaskLine = "작업 유형: refactor. Task type: refactor. Prioritize behavior-preservation evidence over cosmetic cleanup.\n",
-FailureInvestigationTaskLine = "추가 점검: 동작 보존 관점에서 기존 호출 흐름이 동일하게 유지되는지 확인하세요. Extra check: validate existing call flow remains behavior-compatible.\n",
+FollowUpTaskLine = "Task type: refactor. Prioritize behavior-preservation evidence over cosmetic cleanup.\n",
+FailureInvestigationTaskLine = "Extra check: validate existing call flow remains behavior-compatible.\n",
FinalReportTaskLine = "For refactors, include behavior-preservation evidence and impact scope.\n",
},
"review" => new TaskTypePolicy
@@ -59,8 +59,8 @@ internal sealed class TaskTypePolicy
"Report findings with P0-P3 severity and file evidence, then separate Fixed vs Unfixed status. " +
"Preferred tool order: targeted file_read or grep/glob/lsp -> git_tool(diff) when available -> evidence-first findings.",
FailurePatternFocus = "Review focus: severity accuracy (P0-P3), file-grounded evidence, and unresolved-risk clarity.",
-FollowUpTaskLine = "작업 유형: review-follow-up. Task type: review-follow-up. For each finding, state status as Fixed or Unfixed with verification evidence.\n",
-FailureInvestigationTaskLine = "추가 점검: 리뷰에서 지적된 위험은 반드시 수정 근거나 미해결 사유/영향을 남기세요. Extra check: every risk must have either a concrete fix or an explicit unresolved rationale and impact.\n",
+FollowUpTaskLine = "Task type: review-follow-up. For each finding, state status as Fixed or Unfixed with verification evidence.\n",
+FailureInvestigationTaskLine = "Extra check: every risk must have either a concrete fix or an explicit unresolved rationale and impact.\n",
FinalReportTaskLine = "For review, list P0-P3 findings with file evidence and split into Fixed vs Unfixed with residual risk.\n",
IsReviewTask = true
},
@@ -69,10 +69,10 @@ internal sealed class TaskTypePolicy
TaskType = "docs",
GuidanceMessage =
"[System:TaskType] This is a document/content task. " +
-"If the user asks you to CREATE/WRITE a new document, skip file exploration entirely " +
-"go directly to the creation tool (docx_create, html_create, excel_create, etc.) and use document_plan only when it materially helps structure a multi-section document. " +
-"to produce a REAL FILE on disk. Do NOT respond with text only — the output MUST be a file. " +
-"If the user asks you to READ/ANALYZE existing documents, use: glob/grep -> document_read/file_read -> analysis. " +
+"If the user asks you to create or write a new document, skip file exploration entirely and go directly to the creation tool (docx_create, html_create, excel_create, etc.). " +
+"Use document_plan only when it materially helps structure a multi-section document. " +
+"Prefer producing a real file on disk when the request is clearly for a deliverable, but do not force file creation for analysis-only or advisory requests. " +
+"If the user asks you to read or analyze existing documents, use: glob/grep -> document_read/file_read -> analysis. " +
"Use folder_map only when the user explicitly asks for folder contents or directory structure.",
FailurePatternFocus = "Check source evidence and document completeness first.",
FollowUpTaskLine = "",