Retry automatically when LLM returns only thinking blocks#177
Open
okottorika wants to merge 1 commit into boldsoftware:main from
When Claude models intermittently return a response containing only thinking blocks with no text or tool_use content, the UI shows a blank turn to the user. This happens rarely but is confusing.
Add detection for this case in the conversation loop: when a response contains only thinking/redacted_thinking blocks, automatically retry the LLM request instead of ending the turn. The empty response is not recorded to the database, so the user never sees it.
Fixes boldsoftware#131
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Problem
When Claude models intermittently return a response containing only thinking blocks with no text or tool_use content, the UI shows a blank turn to the user. The model produces a thinking block (indicating it understood the request) but generates zero output tokens beyond the end-of-turn token.
This was reported in the Discord bug channel where message 246 in a conversation received no visible response from the agent. A simple "Continue" retry from the user produced a proper 448-token response, confirming this is intermittent LLM-level behavior rather than a Shelley backend bug.
Solution
Add detection in the conversation loop (loop/loop.go) for responses that contain only thinking/redacted_thinking blocks with no visible content. When detected, automatically retry the LLM request instead of ending the turn. The empty response is not recorded to the database, so the user never sees the blank turn.
Changes
respHasNoVisibleContent() helper and a check in processLLMRequest() that retries when the response contains only thinking blocks
TestRetryOnThinkingOnlyResponse (integration test using a mock LLM service) and TestRespHasNoVisibleContent (unit test covering all content type combinations)
Behavior
Before: LLM returns thinking-only → blank turn shown to user → user must manually retry
After: LLM returns thinking-only → loop detects no visible content → automatic retry → user sees normal response
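For reviewers who want the shape of the change without opening the diff, the detect-and-retry behavior can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: only the name respHasNoVisibleContent comes from the description above. The ContentType, Content and Response types, the callWithRetry function, the maxAttempts cap, and the choice to treat a nil or empty response as having no visible content are all hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the project's LLM response types.
type ContentType string

const (
	ContentThinking         ContentType = "thinking"
	ContentRedactedThinking ContentType = "redacted_thinking"
	ContentText             ContentType = "text"
	ContentToolUse          ContentType = "tool_use"
)

type Content struct {
	Type ContentType
	Text string
}

type Response struct {
	Content []Content
}

// respHasNoVisibleContent reports whether every block in resp is a
// thinking or redacted_thinking block. Assumption: a nil or empty
// response also counts as having no visible content.
func respHasNoVisibleContent(resp *Response) bool {
	if resp == nil {
		return true
	}
	for _, c := range resp.Content {
		switch c.Type {
		case ContentThinking, ContentRedactedThinking:
			continue
		default:
			// text, tool_use, or anything else is treated as visible
			return false
		}
	}
	return true
}

var errNoVisibleContent = errors.New("LLM returned no visible content")

// callWithRetry (hypothetical) calls the LLM until a response with
// visible content arrives. Thinking-only responses are dropped without
// being returned, so a caller would never record them. The maxAttempts
// cap is an assumption added here to keep the sketch from looping forever.
func callWithRetry(call func() (*Response, error), maxAttempts int) (*Response, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		resp, err := call()
		if err != nil {
			return nil, err
		}
		if !respHasNoVisibleContent(resp) {
			return resp, nil
		}
	}
	return nil, errNoVisibleContent
}

func main() {
	// Mock LLM: first reply is thinking-only, second carries text.
	replies := []*Response{
		{Content: []Content{{Type: ContentThinking}}},
		{Content: []Content{{Type: ContentThinking}, {Type: ContentText, Text: "hello"}}},
	}
	calls := 0
	resp, err := callWithRetry(func() (*Response, error) {
		r := replies[calls]
		calls++
		return r, nil
	}, 3)
	fmt.Println(calls, len(resp.Content), err)
}
```

Bounding the number of retries is a design choice of this sketch; the description above does not say whether the real loop limits them.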
Fixes #131