Retry automatically when LLM returns only thinking blocks #177

Open
okottorika wants to merge 1 commit into boldsoftware:main from okottorika:fix-thinking-only-retry
@okottorika commented Apr 6, 2026

Problem

Claude models intermittently return a response that contains only thinking blocks, with no text or tool_use content. When that happens, the UI shows a blank turn to the user. The model produces a thinking block (indicating it understood the request) but generates no output tokens beyond the end-of-turn token.

This was reported in the Discord bug channel, where message 246 in a conversation received no visible response from the agent. A simple "Continue" retry from the user produced a proper 448-token response, which indicates that this is intermittent LLM-level behavior rather than a Shelley backend bug.

Solution

Add detection in the conversation loop (loop/loop.go) for responses that contain only thinking/redacted_thinking blocks with no visible content. When detected, automatically retry the LLM request instead of ending the turn. The empty response is not recorded to the database, so the user never sees the blank turn.

Changes

  • loop/loop.go: Added respHasNoVisibleContent() helper and a check in processLLMRequest() that retries when the response contains only thinking blocks
  • loop/loop_test.go: Added TestRetryOnThinkingOnlyResponse (integration test using a mock LLM service) and TestRespHasNoVisibleContent (unit test covering all content type combinations)
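For readers without the diff at hand, the detection described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the `Content` and `ContentType` types are hypothetical stand-ins for the project's real LLM types, and returning false for a completely empty response is an assumption made here for illustration.

```go
package main

import "fmt"

// ContentType and Content are hypothetical stand-ins for the
// project's real LLM content types; they are not taken from the PR.
type ContentType string

const (
	ContentText             ContentType = "text"
	ContentToolUse          ContentType = "tool_use"
	ContentThinking         ContentType = "thinking"
	ContentRedactedThinking ContentType = "redacted_thinking"
)

type Content struct {
	Type ContentType
	Text string
}

// respHasNoVisibleContent reports whether blocks is non-empty and every
// block is a thinking or redacted_thinking block.
// Assumption: a response with no blocks at all is not treated as
// "thinking-only" and is left to other handling.
func respHasNoVisibleContent(blocks []Content) bool {
	if len(blocks) == 0 {
		return false
	}
	for _, b := range blocks {
		switch b.Type {
		case ContentThinking, ContentRedactedThinking:
			// Invisible to the user; keep scanning.
		default:
			// Text, tool_use, or any other type counts as visible.
			return false
		}
	}
	return true
}

func main() {
	thinkingOnly := []Content{{Type: ContentThinking}, {Type: ContentRedactedThinking}}
	mixed := []Content{{Type: ContentThinking}, {Type: ContentText, Text: "hello"}}
	fmt.Println(respHasNoVisibleContent(thinkingOnly), respHasNoVisibleContent(mixed))
}
```

Treating unknown block types as visible is the conservative choice in this sketch: it avoids retrying a response that might carry content the user should see.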

Behavior

Before: LLM returns thinking-only → blank turn shown to user → user must manually retry

After: LLM returns thinking-only → loop detects no visible content → automatic retry → user sees normal response

Fixes #131

@cla-bot cla-bot bot added the cla-signed label Apr 6, 2026
@okottorika force-pushed the fix-thinking-only-retry branch 2 times, most recently from b262a9c to 257ddce, April 6, 2026 17:21
When Claude models intermittently return a response containing only
thinking blocks with no text or tool_use content, the UI shows a blank
turn to the user. This happens rarely but is confusing.

Add detection for this case in the conversation loop: when a response
contains only thinking/redacted_thinking blocks, automatically retry
the LLM request instead of ending the turn. The empty response is not
recorded to the database, so the user never sees it.

Fixes boldsoftware#131

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
@okottorika force-pushed the fix-thinking-only-retry branch from 257ddce to 7e28c09, April 6, 2026 17:29
Development

Successfully merging this pull request may close these issues.

Handle streaming + retries when LLM returns only thinking block