From 99990b9778153d1614f53b7951bc531710f6fc22 Mon Sep 17 00:00:00 2001 From: lacvet Date: Wed, 15 Apr 2026 14:57:55 +0900 Subject: [PATCH] =?UTF-8?q?=EF=BB=BF=EC=BD=94=EB=93=9C=20=ED=83=AD=20?= =?UTF-8?q?=EC=B2=AB=20LLM=20=EC=9D=91=EB=8B=B5=20=EB=8C=80=EA=B8=B0=20?= =?UTF-8?q?=EC=A7=84=EB=8B=A8=EC=9D=84=20=EA=B0=95=ED=99=94=ED=95=98?= =?UTF-8?q?=EA=B3=A0=20heartbeat=20=EC=83=81=ED=83=9C=EB=A5=BC=20=EB=85=B8?= =?UTF-8?q?=EC=B6=9C=ED=95=9C=EB=8B=A4?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 설치형 환경에서 Code 탭 작업이 오래 걸릴 때 첫 도구 호출 전에 정체되는 구간을 추적할 수 있도록 StreamingToolExecutionCoordinator에 대기 heartbeat와 첫 응답 수신 로그를 추가했다. 첫 응답 전에는 모델 요청 시작, 응답 대기 시간, 첫 응답 수신 시점을 AgentLoopWait 로그와 Thinking 이벤트로 남기고, 이후 응답 지연도 heartbeat로 표시하도록 조정했다. 함께 StreamingToolExecutionCoordinatorTests를 추가해 첫 응답 지연 시 heartbeat가 노출되는 경로와 빠른 응답 시 불필요한 heartbeat가 생기지 않는 경로를 고정했다. README.md와 docs/DEVELOPMENT.md에 2026-04-15 14:55 (KST) 기준 이력을 반영했고, dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify_llm_wait_diag\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag\\ 경고 0/오류 0, dotnet test src/AxCopilot.Tests/AxCopilot.Tests.csproj -c Release -v minimal --filter "StreamingToolExecutionCoordinatorTests|AgentLoopLlmRequestPreparationServiceTests|AgentLoopIterationPreparationServiceTests" -p:OutputPath=bin\\verify_llm_wait_diag_tests\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag_tests\\ 통과 6을 확인했다. --- README.md | 7 + docs/DEVELOPMENT.md | 18 +++ .../StreamingToolExecutionCoordinatorTests.cs | 149 ++++++++++++++++++ .../StreamingToolExecutionCoordinator.cs | 63 +++++++- 4 files changed, 235 insertions(+), 2 deletions(-) create mode 100644 src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs diff --git a/README.md b/README.md index 90871e8..dd3004f 100644 --- a/README.md +++ b/README.md @@ -2086,3 +2086,10 @@ MIT License - 검증: - `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify_runtime_policy_alignment_build\\ -p:IntermediateOutputPath=obj\\verify_runtime_policy_alignment_build\\` 경고 0 / 오류 0 - `dotnet test src/AxCopilot.Tests/AxCopilot.Tests.csproj -c Release -v minimal --filter "AgentLoopE2ETests|AgentLoopResponseClassificationServiceTests" -p:OutputPath=bin\\verify_runtime_policy_alignment\\ -p:IntermediateOutputPath=obj\\verify_runtime_policy_alignment\\` 통과 19 + +업데이트: 2026-04-15 14:55 (KST) +- Code 탭 장시간 대기 진단을 보강했습니다. [StreamingToolExecutionCoordinator.cs](/E:/AX%20Copilot%20-%20Codex/src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs)는 LLM 첫 응답 전/후 정체 구간을 heartbeat 방식으로 감지해 `메인 루프 N: 모델에 요청하는 중입니다...`, `모델 첫 응답을 기다리는 중입니다...`, `모델 첫 응답을 받아 계속 진행합니다.` 같은 진행 상태를 이벤트와 앱 로그에 함께 남깁니다. +- 앱 로그에는 이제 `[AgentLoopWait]` 접두사로 `service/model/messages/tools/forceToolCall`과 첫 응답 수신 시간(ms)이 기록됩니다. 설치본에서 Code 작업이 느릴 때 모델 대기인지, 중간 응답 정체인지 로그만으로 더 빠르게 구분할 수 있습니다. +- 새 [StreamingToolExecutionCoordinatorTests.cs](/E:/AX%20Copilot%20-%20Codex/src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs)는 느린 첫 응답일 때 waiting heartbeat가 발생하는지와 빠른 응답에서는 불필요한 wait 문구를 남기지 않는지를 회귀 검증합니다. +- 검증: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify_llm_wait_diag\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag\\` 경고 0 / 오류 0 +- 검증: `dotnet test src/AxCopilot.Tests/AxCopilot.Tests.csproj -c Release -v minimal --filter "StreamingToolExecutionCoordinatorTests|AgentLoopLlmRequestPreparationServiceTests|AgentLoopIterationPreparationServiceTests" -p:OutputPath=bin\\verify_llm_wait_diag_tests\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag_tests\\` 통과 6 diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md index 0d6d93d..3dd9d14 100644 --- a/docs/DEVELOPMENT.md +++ b/docs/DEVELOPMENT.md @@ -1372,3 +1372,21 @@ UI ?붿옄???€洹쒕え 由ы뙥?좊쭅 ???꾪뿕 ?묒뾽 ??湲곕줉???덉쟾 - 테스트: `src/AxCopilot.Tests/Services/SettingsServiceTests.cs`에 `LlmSettings_DefaultMaxContextTokens_IsThirtyTwoK` 추가 - 검증: `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify_context32k\\ -p:IntermediateOutputPath=obj\\verify_context32k\\` 경고 0 / 오류 0 - 검증: `dotnet test src/AxCopilot.Tests/AxCopilot.Tests.csproj -c Release -v minimal --filter "SettingsServiceTests" -p:OutputPath=bin\\verify_context32k_tests\\ -p:IntermediateOutputPath=obj\\verify_context32k_tests\\` 통과 32 + +업데이트: 2026-04-15 14:55 (KST) + +### Code 탭 LLM 대기 heartbeat 진단 보강 +- `src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs` + - 첫 응답 전 `LLM 요청 시작`, 첫 응답 수신 시점, 이후 장시간 대기 구간을 `[AgentLoopWait]` 로그로 남기도록 보강했습니다. + - `onStreamEventAsync`가 있는 스트리밍 경로에서 첫 이벤트가 늦게 오면 `메인 루프 N: 모델 첫 응답을 기다리는 중입니다... (n초)` heartbeat를 주기적으로 `Thinking` 이벤트로 발행합니다. + - 첫 이벤트가 늦게 도착한 경우 `메인 루프 N: 모델 첫 응답을 받아 계속 진행합니다.`를 한 번 더 기록해, 설치본 로그와 UI 진행 이력만으로도 “아예 멈춤”과 “응답 대기”를 구분할 수 있게 했습니다. + - 빠른 응답에서는 waiting heartbeat를 남기지 않도록 기본 heartbeat 지연값을 두고, 테스트에서는 짧은 지연값으로만 동작을 검증합니다. + +### 테스트 +- 새 `src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs` + - 느린 첫 응답일 때 waiting heartbeat와 first-response recovery 메시지가 나오는지 검증 + - 빠른 첫 응답에서는 대기 heartbeat가 발생하지 않는지 검증 + +### 검증 +- `dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\\verify_llm_wait_diag\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag\\` 경고 0 / 오류 0 +- `dotnet test src/AxCopilot.Tests/AxCopilot.Tests.csproj -c Release -v minimal --filter "StreamingToolExecutionCoordinatorTests|AgentLoopLlmRequestPreparationServiceTests|AgentLoopIterationPreparationServiceTests" -p:OutputPath=bin\\verify_llm_wait_diag_tests\\ -p:IntermediateOutputPath=obj\\verify_llm_wait_diag_tests\\` 통과 6 diff --git a/src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs b/src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs new file mode 100644 index 0000000..90a04ac --- /dev/null +++ b/src/AxCopilot.Tests/Services/StreamingToolExecutionCoordinatorTests.cs @@ -0,0 +1,149 @@ +using System.Runtime.CompilerServices; +using System.Text.Json; +using AxCopilot.Models; +using AxCopilot.Services; +using AxCopilot.Services.Agent; +using FluentAssertions; +using Xunit; + +namespace AxCopilot.Tests.Services; + +public class StreamingToolExecutionCoordinatorTests +{ + [Fact] + public async Task SendWithToolsWithRecoveryAsync_ShouldEmitWaitHeartbeatBeforeFirstStreamEvent() + { + var llm = new FakeLlmService + { + InitialDelay = TimeSpan.FromMilliseconds(90), + Events = + [ + new ToolStreamEvent(ToolStreamEventKind.TextDelta, "clock"), + new ToolStreamEvent(ToolStreamEventKind.Completed) + ] + }; + + var emitted = new List(); + var coordinator = CreateCoordinator( + llm, + emitted, + firstResponseHeartbeatDelay: TimeSpan.FromMilliseconds(20), + responseHeartbeatInterval: TimeSpan.FromMilliseconds(20)); + + var result = await coordinator.SendWithToolsWithRecoveryAsync( + [new ChatMessage { Role = "user", Content = "make a clock page" }], + Array.Empty(), + CancellationToken.None, + "메인 루프 1", + onStreamEventAsync: _ => Task.CompletedTask); + + result.Should().ContainSingle(block => block.Type == "text" && block.Text == "clock"); + emitted.Should().Contain(evt => + evt.Type == AgentEventType.Thinking && + evt.Summary.Contains("모델 첫 응답을 기다리는 중입니다", StringComparison.Ordinal)); + emitted.Should().Contain(evt => + evt.Type == AgentEventType.Thinking && + evt.Summary.Contains("모델 첫 응답을 받아 계속 진행합니다", StringComparison.Ordinal)); + } + + [Fact] + public async Task SendWithToolsWithRecoveryAsync_ShouldSkipWaitHeartbeatWhenFirstStreamEventIsQuick() + { + var llm = new FakeLlmService + { + InitialDelay = TimeSpan.FromMilliseconds(5), + Events = + [ + new ToolStreamEvent(ToolStreamEventKind.TextDelta, "ok"), + new ToolStreamEvent(ToolStreamEventKind.Completed) + ] + }; + + var emitted = new List(); + var coordinator = CreateCoordinator( + llm, + emitted, + firstResponseHeartbeatDelay: TimeSpan.FromMilliseconds(80), + responseHeartbeatInterval: TimeSpan.FromMilliseconds(80)); + + var result = await coordinator.SendWithToolsWithRecoveryAsync( + [new ChatMessage { Role = "user", Content = "respond quickly" }], + Array.Empty(), + CancellationToken.None, + "메인 루프 1", + onStreamEventAsync: _ => Task.CompletedTask); + + result.Should().ContainSingle(block => block.Type == "text" && block.Text == "ok"); + emitted.Should().NotContain(evt => + evt.Summary.Contains("모델 첫 응답을 기다리는 중입니다", StringComparison.Ordinal)); + emitted.Should().Contain(evt => + evt.Type == AgentEventType.Thinking && + evt.Summary.Contains("모델에 요청하는 중입니다", StringComparison.Ordinal)); + } + + private static StreamingToolExecutionCoordinator CreateCoordinator( + ILlmService llm, + List emitted, + TimeSpan firstResponseHeartbeatDelay, + TimeSpan responseHeartbeatInterval) + { + return new StreamingToolExecutionCoordinator( + llm, + (name, _) => name, + (_, _, _, _, _) => Task.FromResult(ToolResult.Fail("unused")), + (type, toolName, summary) => emitted.Add(new AgentEvent + { + Type = type, + ToolName = toolName, + Summary = summary + }), + _ => false, + _ => false, + _ => false, + (_, _) => 0, + firstResponseHeartbeatDelay, + responseHeartbeatInterval); + } + + private sealed class FakeLlmService : ILlmService + { + public TimeSpan InitialDelay { get; init; } = TimeSpan.Zero; + public List Events { get; init; } = []; + + public (string service, string model) GetCurrentModelInfo() => ("deepseek", "deepseek-chat"); + public Task SendAsync(List messages, CancellationToken ct = default) => Task.FromResult(string.Empty); + public async IAsyncEnumerable StreamAsync(List messages, [EnumeratorCancellation] CancellationToken ct = default) + { + await Task.CompletedTask; + yield break; + } + public Task<(bool ok, string message)> TestConnectionAsync() => Task.FromResult((true, "ok")); + public void PushRouteOverride(string service, string model) { } + public void ClearRouteOverride() { } + public void PushInferenceOverride(string? service = null, string? model = null, double? temperature = null, string? reasoningEffort = null) { } + public void PopInferenceOverride() { } + public TokenUsage? LastTokenUsage => null; + public RuntimeConnectionSnapshot GetRuntimeConnectionSnapshot() => new("deepseek", "deepseek-chat", "", false, true); + public Task> SendWithToolsAsync(List messages, IReadOnlyCollection tools, CancellationToken ct = default, bool forceToolCall = false, Func>? prefetchToolCallAsync = null) + => Task.FromResult(new List()); + public async IAsyncEnumerable StreamWithToolsAsync( + List messages, + IReadOnlyCollection tools, + bool forceToolCall = false, + Func>? prefetchToolCallAsync = null, + [EnumeratorCancellation] CancellationToken ct = default) + { + if (InitialDelay > TimeSpan.Zero) + await Task.Delay(InitialDelay, ct); + + foreach (var evt in Events) + yield return evt; + } + + public ModelExecutionProfileCatalog.ExecutionPolicy GetActiveExecutionPolicy() + => ModelExecutionProfileCatalog.Get("balanced"); + + public string? SystemPrompt => null; + public void Dispose() { } + } +} diff --git a/src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs b/src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs index cb170f1..de820ed 100644 --- a/src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs +++ b/src/AxCopilot/Services/Agent/StreamingToolExecutionCoordinator.cs @@ -25,6 +25,8 @@ internal sealed class StreamingToolExecutionCoordinator : IToolExecutionCoordina private readonly Func, bool> _forceContextRecovery; private readonly Func _isTransientLlmError; private readonly Func _computeTransientBackoffDelayMs; + private readonly TimeSpan _firstResponseHeartbeatDelay; + private readonly TimeSpan _responseHeartbeatInterval; public StreamingToolExecutionCoordinator( ILlmService llm, @@ -34,7 +36,9 @@ internal sealed class StreamingToolExecutionCoordinator : IToolExecutionCoordina Func isContextOverflowError, Func, bool> forceContextRecovery, Func isTransientLlmError, - Func computeTransientBackoffDelayMs) + Func computeTransientBackoffDelayMs, + TimeSpan? firstResponseHeartbeatDelay = null, + TimeSpan? responseHeartbeatInterval = null) { _llm = llm; _resolveRequestedToolName = resolveRequestedToolName; @@ -44,6 +48,8 @@ internal sealed class StreamingToolExecutionCoordinator : IToolExecutionCoordina _forceContextRecovery = forceContextRecovery; _isTransientLlmError = isTransientLlmError; _computeTransientBackoffDelayMs = computeTransientBackoffDelayMs; + _firstResponseHeartbeatDelay = firstResponseHeartbeatDelay ?? TimeSpan.FromSeconds(8); + _responseHeartbeatInterval = responseHeartbeatInterval ?? TimeSpan.FromSeconds(15); } public async Task TryPrefetchReadOnlyToolAsync( @@ -105,8 +111,51 @@ internal sealed class StreamingToolExecutionCoordinator : IToolExecutionCoordina var blocks = new List(); var textBuilder = new StringBuilder(); - await foreach (var evt in _llm.StreamWithToolsAsync(messages, tools, forceToolCall, prefetchToolCallAsync, ct).WithCancellation(ct)) + var (service, model) = _llm.GetCurrentModelInfo(); + LogService.Info( + $"[AgentLoopWait] {phaseLabel}: LLM 요청 시작 (service={service}, model={model}, messages={messages.Count}, tools={tools.Count}, forceToolCall={forceToolCall})"); + _emitEvent(AgentEventType.Thinking, "", $"{phaseLabel}: 모델에 요청하는 중입니다..."); + + var waitStopwatch = Stopwatch.StartNew(); + var firstEventReceived = false; + var nextHeartbeatAt = _firstResponseHeartbeatDelay; + + await using var stream = _llm + .StreamWithToolsAsync(messages, tools, forceToolCall, prefetchToolCallAsync, ct) + .GetAsyncEnumerator(ct); + + while (true) { + var moveNextTask = stream.MoveNextAsync().AsTask(); + while (!moveNextTask.IsCompleted) + { + var remaining = nextHeartbeatAt - waitStopwatch.Elapsed; + if (remaining < TimeSpan.Zero) + remaining = TimeSpan.Zero; + + var delayTask = Task.Delay(remaining, ct); + var completedTask = await Task.WhenAny(moveNextTask, delayTask).ConfigureAwait(false); + if (completedTask == moveNextTask) + break; + + var waited = waitStopwatch.Elapsed; + EmitWaitHeartbeat(phaseLabel, waited, firstEventReceived); + nextHeartbeatAt = waited + _responseHeartbeatInterval; + } + + if (!await moveNextTask.ConfigureAwait(false)) + break; + + var evt = stream.Current; + if (!firstEventReceived) + { + firstEventReceived = true; + LogService.Info( + $"[AgentLoopWait] {phaseLabel}: 첫 응답 수신 ({waitStopwatch.ElapsedMilliseconds}ms, kind={evt.Kind})"); + if (waitStopwatch.Elapsed >= _firstResponseHeartbeatDelay) + _emitEvent(AgentEventType.Thinking, "", $"{phaseLabel}: 모델 첫 응답을 받아 계속 진행합니다."); + } + await onStreamEventAsync(evt); if (evt.Kind == ToolStreamEventKind.TextDelta && !string.IsNullOrWhiteSpace(evt.Text)) { @@ -170,4 +219,14 @@ internal sealed class StreamingToolExecutionCoordinator : IToolExecutionCoordina } } } + + private void EmitWaitHeartbeat(string phaseLabel, TimeSpan waited, bool firstEventReceived) + { + var seconds = Math.Max(1, (int)Math.Round(waited.TotalSeconds)); + var summary = firstEventReceived + ? $"{phaseLabel}: 모델 응답이 길어져 계속 기다리는 중입니다... ({seconds}초)" + : $"{phaseLabel}: 모델 첫 응답을 기다리는 중입니다... ({seconds}초)"; + LogService.Info($"[AgentLoopWait] {summary}"); + _emitEvent(AgentEventType.Thinking, "", summary); + } }