Strengthen IBM vLLM tool-call streaming and model-profile-based execution policy

- Apply the profile-based tool temperature to IBM deployment-type tool-call request bodies, and add a more direct tool-only instruction under the tool_call_strict profile
- Add a fallback forced-retry path that removes only tool_choice when the IBM path rejects tool_choice
- Receive OpenAI/vLLM tool-use responses over SSE and partially assemble delta.tool_calls so that tool calls are detected sooner
- Introduce early execution of read-only tools and a result-reuse path to speed up tool start in Cowork/Code
- Update the README and DEVELOPMENT docs as of 2026-04-08 11:14 (KST)

Verification
- dotnet build src/AxCopilot/AxCopilot.csproj -c Release -v minimal -p:OutputPath=bin\verify\ -p:IntermediateOutputPath=obj\verify\
- 0 warnings / 0 errors
2026-04-08 16:48:11 +09:00
parent a2c952879d
commit 90ef3400f6
20 changed files with 1231 additions and 241 deletions
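
The partial assembly of delta.tool_calls mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration and not the AxCopilot implementation: the `ToolCallAssembler` and `PartialCall` names are hypothetical, and the chunk shape (`choices[0].delta.tool_calls[]` with `index`, `id`, `function.name`, `function.arguments`, terminated by `data: [DONE]`) is assumed to follow the OpenAI-compatible streaming format that vLLM exposes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical accumulator for streamed tool-call fragments (illustration only).
public sealed class ToolCallAssembler
{
    public sealed class PartialCall
    {
        public string? Id { get; set; }
        public string? Name { get; set; }
        // Argument JSON arrives in pieces; it is only parseable once complete.
        public StringBuilder Arguments { get; } = new StringBuilder();
    }

    private readonly SortedDictionary<int, PartialCall> _calls = new SortedDictionary<int, PartialCall>();

    public IReadOnlyDictionary<int, PartialCall> Calls => _calls;

    // Feeds one SSE line. Returns false once the "[DONE]" sentinel is seen.
    public bool Feed(string line)
    {
        // Ignore blank lines, comments, and non-data fields.
        if (!line.StartsWith("data:", StringComparison.Ordinal)) return true;
        var payload = line.Substring(5).Trim();
        if (payload == "[DONE]") return false;
        if (payload.Length == 0) return true;

        using var doc = JsonDocument.Parse(payload);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object) return true;
        if (!root.TryGetProperty("choices", out var choices)
            || choices.ValueKind != JsonValueKind.Array
            || choices.GetArrayLength() == 0) return true;

        var choice = choices[0];
        if (choice.ValueKind != JsonValueKind.Object) return true;
        if (!choice.TryGetProperty("delta", out var delta)
            || delta.ValueKind != JsonValueKind.Object) return true;
        if (!delta.TryGetProperty("tool_calls", out var toolCalls)
            || toolCalls.ValueKind != JsonValueKind.Array) return true;

        foreach (var tc in toolCalls.EnumerateArray())
        {
            if (tc.ValueKind != JsonValueKind.Object) continue;

            // Fragments of the same call share an index; default to 0 if absent.
            var index = tc.TryGetProperty("index", out var i) && i.ValueKind == JsonValueKind.Number
                ? i.GetInt32()
                : 0;
            if (!_calls.TryGetValue(index, out var call))
            {
                call = new PartialCall();
                _calls[index] = call;
            }

            if (tc.TryGetProperty("id", out var id) && id.ValueKind == JsonValueKind.String)
                call.Id = id.GetString();

            if (tc.TryGetProperty("function", out var fn) && fn.ValueKind == JsonValueKind.Object)
            {
                if (fn.TryGetProperty("name", out var name) && name.ValueKind == JsonValueKind.String)
                    call.Name = name.GetString();
                if (fn.TryGetProperty("arguments", out var args) && args.ValueKind == JsonValueKind.String)
                    call.Arguments.Append(args.GetString());
            }
        }
        return true;
    }
}
```

With an accumulator like this, the tool name is usually known after the first fragment, which is what would allow a caller to detect a tool call before the full argument JSON has arrived. Whether a call can safely start early (for example, only for read-only tools) is a separate policy decision that this sketch does not cover.
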
--- a/src/AxCopilot/Views/ChatWindow.ContextUsagePresentation.cs
+++ b/src/AxCopilot/Views/ChatWindow.ContextUsagePresentation.cs
@@ -6,6 +6,11 @@ namespace AxCopilot.Views;

 public partial class ChatWindow
 {
+    // Token estimate cache: recomputed only when the message count or conversation ID changes
+    private int _cachedMessageTokens;
+    private int _cachedMessageCountForTokens = -1;
+    private string? _cachedConvIdForTokens;
+
    private void RefreshContextUsageVisual()
    {
        if (TokenUsageCard == null || TokenUsageArc == null || TokenUsagePercentText == null
@@ -27,11 +32,22 @@ public partial class ChatWindow
        var triggerPercent = Math.Clamp(llm.ContextCompactTriggerPercent, 10, 95);
        var triggerRatio = triggerPercent / 100.0;

+        // Message token estimate: recomputed only when the message count or conversation ID changes (avoids repeated computation while typing)
        int messageTokens;
        lock (_convLock)
-            messageTokens = _currentConversation?.Messages?.Count > 0
-                ? Services.TokenEstimator.EstimateMessages(_currentConversation.Messages)
-                : 0;
+        {
+            var convId = _currentConversation?.Id;
+            var msgCount = _currentConversation?.Messages?.Count ?? 0;
+            if (convId != _cachedConvIdForTokens || msgCount != _cachedMessageCountForTokens)
+            {
+                _cachedMessageTokens = msgCount > 0
+                    ? Services.TokenEstimator.EstimateMessages(_currentConversation!.Messages)
+                    : 0;
+                _cachedConvIdForTokens = convId;
+                _cachedMessageCountForTokens = msgCount;
+            }
+            messageTokens = _cachedMessageTokens;
+        }

        var draftText = InputBox?.Text ?? "";
        var draftTokens = string.IsNullOrWhiteSpace(draftText) ? 0 : Services.TokenEstimator.Estimate(draftText) + 4;