Strengthen IBM vLLM tool-call streaming and model-profile-based execution policy

- Apply the profile-based tool temperature to IBM deployment tool-call request bodies, and add a more direct tool-only instruction for the tool_call_strict profile
- Add a fallback forced-retry path that removes only tool_choice when the IBM path rejects tool_choice
- Receive OpenAI/vLLM tool-use responses over SSE and partially assemble delta.tool_calls so that tool calls are detected sooner
- Introduce early execution of read-only tools and a result-reuse path to improve how quickly Cowork/Code tools start
- Update the README and DEVELOPMENT documents as of 2026-04-08 11:14 (KST)

Verification
- dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\
- 0 warnings / 0 errors
2026-04-08 16:48:11 +09:00
parent a2c952879d
commit 90ef3400f6
20 changed files with 1231 additions and 241 deletions
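
The commit message mentions receiving OpenAI/vLLM tool-use responses over SSE and partially assembling delta.tool_calls. The LlmService side of that change is not part of this excerpt, so the following is only a minimal sketch, assuming the OpenAI-compatible streaming chunk layout (choices[0].delta.tool_calls[] entries carrying index, id, function.name, and function.arguments fragments). ToolCallDeltaAssembler, Feed, and CompletedCalls are hypothetical names used for illustration and are not taken from the repository.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper (not the project's LlmService code): accumulates
// OpenAI-style streamed tool-call fragments, keyed by their "index" field.
public sealed class ToolCallDeltaAssembler
{
    private sealed class Pending
    {
        public string? Id;
        public string? Name;
        public readonly StringBuilder Arguments = new();
    }

    private readonly SortedDictionary<int, Pending> _pending = new();

    // Feeds one SSE line. Returns false once the "[DONE]" sentinel is seen.
    public bool Feed(string sseLine)
    {
        // Ignore blank lines, comments, and non-data fields.
        if (!sseLine.StartsWith("data:", StringComparison.Ordinal)) return true;
        var payload = sseLine.Substring("data:".Length).Trim();
        if (payload.Length == 0) return true;
        if (payload == "[DONE]") return false;

        using var doc = JsonDocument.Parse(payload);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty("choices", out var choices)
            || choices.ValueKind != JsonValueKind.Array
            || choices.GetArrayLength() == 0) return true;

        var choice = choices[0];
        if (choice.ValueKind != JsonValueKind.Object
            || !choice.TryGetProperty("delta", out var delta)
            || delta.ValueKind != JsonValueKind.Object
            || !delta.TryGetProperty("tool_calls", out var calls)
            || calls.ValueKind != JsonValueKind.Array) return true;

        foreach (var call in calls.EnumerateArray())
        {
            if (call.ValueKind != JsonValueKind.Object) continue;
            var index = call.TryGetProperty("index", out var i) && i.ValueKind == JsonValueKind.Number
                ? i.GetInt32()
                : 0;
            if (!_pending.TryGetValue(index, out var p)) _pending[index] = p = new Pending();

            if (call.TryGetProperty("id", out var id) && id.ValueKind == JsonValueKind.String)
                p.Id = id.GetString();
            if (call.TryGetProperty("function", out var fn) && fn.ValueKind == JsonValueKind.Object)
            {
                // The name normally arrives in the first fragment; skip empty repeats.
                if (fn.TryGetProperty("name", out var n) && n.ValueKind == JsonValueKind.String
                    && !string.IsNullOrEmpty(n.GetString()))
                    p.Name = n.GetString();
                // Arguments arrive as string fragments that must be concatenated.
                if (fn.TryGetProperty("arguments", out var a) && a.ValueKind == JsonValueKind.String)
                    p.Arguments.Append(a.GetString());
            }
        }
        return true;
    }

    // Returns the calls whose name is known and whose accumulated arguments
    // already parse as a JSON object, so they can be acted on mid-stream.
    public List<(string? Id, string Name, string ArgumentsJson)> CompletedCalls()
    {
        var done = new List<(string? Id, string Name, string ArgumentsJson)>();
        foreach (var p in _pending.Values)
        {
            var args = p.Arguments.ToString();
            if (!string.IsNullOrEmpty(p.Name) && IsJsonObject(args))
                done.Add((p.Id, p.Name!, args));
        }
        return done;
    }

    private static bool IsJsonObject(string text)
    {
        try
        {
            using var doc = JsonDocument.Parse(text);
            return doc.RootElement.ValueKind == JsonValueKind.Object;
        }
        catch (JsonException)
        {
            return false; // still incomplete (or empty)
        }
    }
}
```

In a loop that reads the response stream line by line, a caller could invoke Feed for each line and check CompletedCalls after each chunk. A call that is already complete could then be handed to a callback such as the prefetchToolCallAsync parameter that appears in the diff below, although how the actual implementation wires this together is not shown in this excerpt.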
--- a/src/AxCopilot/Services/Agent/AgentLoopService.cs
+++ b/src/AxCopilot/Services/Agent/AgentLoopService.cs
@@ -150,6 +150,10 @@ public partial class AgentLoopService

        // Capture the user's original request (used to decide on the document-generation fallback)
        var userQuery = messages.LastOrDefault(m => m.Role == "user")?.Content ?? "";
+
+        // Detailed workflow log: agent loop start
+        WorkflowLogService.LogAgentLifecycle(_conversationId, _currentRunId, "start",
+            userPrompt: userQuery);
        var consecutiveErrors = 0; // Self-Reflection: consecutive error counter
        var totalToolCalls = 0;    // used for complexity estimation
        string? lastFailedToolSignature = null;
@@ -510,6 +514,7 @@ public partial class AgentLoopService

                // Send the request to the LLM together with the tool definitions
                List<LlmService.ContentBlock> blocks;
+                var llmCallSw = Stopwatch.StartNew();
                try
                {
                    var activeTools = GetRuntimeActiveTools(llm.DisabledTools, runtimeOverrides);
@@ -521,14 +526,43 @@ public partial class AgentLoopService
                    // totalToolCalls == 0: if no tool has been called yet, force tool_choice:"required"
                    // → prevents chatty models (Qwen, etc.) from only describing things in text without calling a tool
                    var forceFirst = totalToolCalls == 0 && executionPolicy.ForceInitialToolCall;
+
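+                    // For chatty models such as IBM/Qwen: just before the first call, inject a reminder that forces a tool call as the final user message.
+                    // If a recovery message has already been added (NoToolCallLoopRetry > 0), do not inject it again.
+                    // The reminder is temporary, so the actual messages list is left unmodified and a separate sendMessages list is sent instead.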
+                    List<ChatMessage> sendMessages = messages;
+                    if (forceFirst
+                        && executionPolicy.InjectPreCallToolReminder
+                        && runState.NoToolCallLoopRetry == 0)
+                    {
+                        sendMessages = [.. messages, new ChatMessage
+                        {
+                            Role = "user",
+                            Content = "[TOOL_REQUIRED] 지금 즉시 도구를 1개 이상 호출하세요. 텍스트만 반환하면 거부됩니다. " +
+                                      "Call at least one tool RIGHT NOW. Text-only response is rejected."
+                        }];
+                    }
+
+                    // Detailed workflow log: LLM request
+                    llmCallSw.Restart();
+                    var (_, currentModel) = _llm.GetCurrentModelInfo();
+                    WorkflowLogService.LogLlmRequest(_conversationId, _currentRunId, iteration,
+                        currentModel, sendMessages.Count, activeTools.Count, forceFirst);
+
                    blocks = await SendWithToolsWithRecoveryAsync(
-                        messages,
+                        sendMessages,
                        activeTools,
                        ct,
                        $"메인 루프 {iteration}",
                        runState,
-                        forceToolCall: forceFirst);
+                        forceToolCall: forceFirst,
+                        prefetchToolCallAsync: block => TryPrefetchReadOnlyToolAsync(
+                            block,
+                            activeTools,
+                            context,
+                            ct));
                    runState.ContextRecoveryAttempts = 0;
+                    llmCallSw.Stop();
                    runState.TransientLlmErrorRetries = 0;
                    NotifyPostCompactionTurnIfNeeded(runState);
                }
@@ -646,6 +680,13 @@ public partial class AgentLoopService
                var textResponse = string.Join("\n", textParts);
                consecutiveNoToolResponses = toolCalls.Count == 0 ? consecutiveNoToolResponses + 1 : 0;

+                // Detailed workflow log: LLM response
+                WorkflowLogService.LogLlmResponse(_conversationId, _currentRunId, iteration,
+                    textResponse, toolCalls.Count,
+                    _llm.LastTokenUsage?.PromptTokens ?? 0,
+                    _llm.LastTokenUsage?.CompletionTokens ?? 0,
+                    llmCallSw.ElapsedMilliseconds);
+
                // Task Decomposition: extract plan steps from the first text response
                if (!planExtracted && !string.IsNullOrEmpty(textResponse))
                {
@@ -1126,22 +1167,27 @@ public partial class AgentLoopService
                    repeatedUnknownToolCount = 0;
                    lastDisallowedToolName = null;
                    repeatedDisallowedToolCount = 0;
-                    var effectiveCall = string.Equals(call.ToolName, resolvedToolName, StringComparison.OrdinalIgnoreCase)
+                    var effectiveToolName = !string.IsNullOrWhiteSpace(call.ResolvedToolName)
+                        ? call.ResolvedToolName
+                        : resolvedToolName;
+                    var effectiveCall = string.Equals(call.ToolName, effectiveToolName, StringComparison.OrdinalIgnoreCase)
                        ? call
                        : new LlmService.ContentBlock
                        {
                            Type = call.Type,
                            Text = call.Text,
-                            ToolName = resolvedToolName,
+                            ToolName = effectiveToolName,
                            ToolId = call.ToolId,
                            ToolInput = call.ToolInput,
+                            ResolvedToolName = effectiveToolName,
+                            PrefetchedExecutionTask = call.PrefetchedExecutionTask,
                        };
-                    if (!string.Equals(call.ToolName, resolvedToolName, StringComparison.OrdinalIgnoreCase))
+                    if (!string.Equals(call.ToolName, effectiveToolName, StringComparison.OrdinalIgnoreCase))
                    {
                        EmitEvent(
                            AgentEventType.Thinking,
-                            resolvedToolName,
-                            $"도구명 정규화 적용: '{call.ToolName}' → '{resolvedToolName}'");
+                            effectiveToolName,
+                            $"도구명 정규화 적용: '{call.ToolName}' → '{effectiveToolName}'");
                    }

                    var toolCallSignature = BuildToolCallSignature(effectiveCall);
@@ -1279,11 +1325,37 @@ public partial class AgentLoopService
                    }

                    ToolResult result;
+                    long elapsedMs;
                    var sw = Stopwatch.StartNew();
                    try
                    {
-                        var input = effectiveCall.ToolInput ?? JsonDocument.Parse("{}").RootElement;
-                        result = await ExecuteToolWithTimeoutAsync(tool, effectiveCall.ToolName, input, context, messages, ct);
+                        if (effectiveCall.PrefetchedExecutionTask != null)
+                        {
+                            var prefetched = await effectiveCall.PrefetchedExecutionTask.ConfigureAwait(false);
+                            if (prefetched != null)
+                            {
+                                result = prefetched.Result;
+                                elapsedMs = prefetched.ElapsedMilliseconds;
+                                EmitEvent(
+                                    AgentEventType.Thinking,
+                                    effectiveCall.ToolName,
+                                    $"조기 실행 결과 재사용: {effectiveCall.ToolName}");
+                            }
+                            else
+                            {
+                                var input = effectiveCall.ToolInput ?? JsonDocument.Parse("{}").RootElement;
+                                result = await ExecuteToolWithTimeoutAsync(tool, effectiveCall.ToolName, input, context, messages, ct);
+                                sw.Stop();
+                                elapsedMs = sw.ElapsedMilliseconds;
+                            }
+                        }
+                        else
+                        {
+                            var input = effectiveCall.ToolInput ?? JsonDocument.Parse("{}").RootElement;
+                            result = await ExecuteToolWithTimeoutAsync(tool, effectiveCall.ToolName, input, context, messages, ct);
+                            sw.Stop();
+                            elapsedMs = sw.ElapsedMilliseconds;
+                        }
                    }
                    catch (OperationCanceledException)
                    {
@@ -1300,8 +1372,9 @@ public partial class AgentLoopService
                    catch (Exception ex)
                    {
                        result = ToolResult.Fail($"도구 실행 오류: {ex.Message}");
+                        sw.Stop();
+                        elapsedMs = sw.ElapsedMilliseconds;
                    }
-                    sw.Stop();

                    // ── Run post-hooks ──
                    if (llm.EnableToolHooks && runtimeHooks.Count > 0)
@@ -1340,7 +1413,7 @@ public partial class AgentLoopService
                        effectiveCall.ToolName,
                        TruncateOutput(result.Output, 200),
                        result.FilePath,
-                        elapsedMs: sw.ElapsedMilliseconds,
+                        elapsedMs: elapsedMs,
                        inputTokens: tokenUsage?.PromptTokens ?? 0,
                        outputTokens: tokenUsage?.CompletionTokens ?? 0,
                        toolInput: effectiveCall.ToolInput?.ToString(),
@@ -1533,6 +1606,10 @@ public partial class AgentLoopService
            if (runtimeOverrideApplied)
                _llm.PopInferenceOverride();

+            // Detailed workflow log: agent loop end
+            WorkflowLogService.LogAgentLifecycle(_conversationId, _currentRunId, "end",
+                summary: $"iterations={iteration}, tools={totalToolCalls}, success={statsSuccessCount}, fail={statsFailCount}");
+
            IsRunning = false;
            _currentRunId = "";
            _runPendingPostCompactionTurn = false;