feat: auto-compact and retry on context window errors #3217
Open
TheArchitectit wants to merge 2 commits into
When the model API returns a context window exceeded error, the CLI now
automatically compacts the session to free up token budget, then retries
the failed turn. This prevents users from hitting a hard stop when
sessions grow too long.
Problem:
Previously, auto-compact retry only worked in the interactive REPL path
(run_turn). The non-interactive paths (run_prompt_json,
run_prompt_compact, run_prompt_compact_json) simply propagated the
error via result? with no retry. Additionally, context window
detection used ad-hoc string matching (contains("context_window") ||
contains("no parseable body")) instead of the canonical detection
method in the api crate.
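The difference between the inline check and a centralized marker table can be sketched as follows. This is a minimal illustration, not the code from the api crate: the two marker strings are the only ones named in this PR, the real CONTEXT_WINDOW_ERROR_MARKERS list may contain more entries, and the free function signature taking a &str is an assumption made for the example.

```rust
/// Hypothetical marker table. Only these two strings are named in the PR;
/// the real list in the api crate may be longer.
const CONTEXT_WINDOW_ERROR_MARKERS: &[&str] = &["context_window", "no parseable body"];

/// The ad-hoc check as it was inlined at the call site.
fn ad_hoc_check(message: &str) -> bool {
    message.contains("context_window") || message.contains("no parseable body")
}

/// Centralized check: every caller consults the same table, so adding a
/// marker in one place updates all call sites.
fn is_context_window_failure(message: &str) -> bool {
    CONTEXT_WINDOW_ERROR_MARKERS
        .iter()
        .any(|marker| message.contains(marker))
}

fn main() {
    let samples = [
        "400 Bad Request: no parseable body",
        "request failed: context_window exceeded",
        "401 Unauthorized",
    ];
    for sample in samples {
        println!(
            "{sample:?}: ad_hoc={} table={}",
            ad_hoc_check(sample),
            is_context_window_failure(sample)
        );
    }
}
```

With only two markers the two functions behave identically; the benefit of the table is that a new marker is added once rather than at each call site.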
Solution:
1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api
crate, so is_context_window_failure() now covers OpenAI-compat
backends that return 400 with an un-parseable body when the request
exceeds context limits.
2. Added RuntimeError::is_context_window_failure() method in the
runtime crate. Since ApiError is erased into a string message when
it crosses the runtime boundary, we need a runtime-level marker
check that mirrors the api crate's detection. This replaces the
ad-hoc string matching that was inlined in run_turn().
3. Extracted the auto-compact retry logic from run_turn() into a
shared LiveCli::auto_compact_retry() method. This method:
- Detects context window errors via RuntimeError::is_context_window_failure()
- Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
- Retries the same user input with the compacted session
- Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
- Logs user-facing messages like "Context limit reached, auto-compacting
session... (attempt N/3)"
4. Extended auto-compact retry to ALL turn execution paths:
- run_turn() (interactive REPL) — now uses shared helper
- run_prompt_compact() (-p --compact) — auto-retry added
- run_prompt_compact_json() (-p --compact --json) — auto-retry added
- run_prompt_json() (-p --json) — auto-retry added
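The bounded, progressive retry described in the list above could look roughly like the sketch below. Everything except the names RuntimeError::is_context_window_failure(), RUNTIME_CONTEXT_WINDOW_MARKERS, MAX_COMPACT_RETRIES, the 4 -> 2 -> 0 schedule, and the log message is hypothetical: the Session type, its toy one-token-per-message budget, the compact() behavior, and the helper's signature are assumptions for illustration, not the actual LiveCli code.

```rust
/// Hypothetical stand-in for the runtime error: the API error has already
/// been flattened into a message string at the runtime boundary.
#[derive(Debug)]
struct RuntimeError {
    message: String,
}

/// Marker strings named in the PR; the real constant may differ.
const RUNTIME_CONTEXT_WINDOW_MARKERS: &[&str] = &["context_window", "no parseable body"];

impl RuntimeError {
    fn is_context_window_failure(&self) -> bool {
        RUNTIME_CONTEXT_WINDOW_MARKERS
            .iter()
            .any(|marker| self.message.contains(marker))
    }
}

const MAX_COMPACT_RETRIES: usize = 3;
/// Number of recent messages preserved on each successive attempt.
const PRESERVE_SCHEDULE: [usize; MAX_COMPACT_RETRIES] = [4, 2, 0];

/// Toy session (assumption): each message costs one unit of budget.
struct Session {
    messages: Vec<String>,
    limit: usize,
}

impl Session {
    fn run_turn(&self, input: &str) -> Result<String, RuntimeError> {
        if self.messages.len() + 1 > self.limit {
            return Err(RuntimeError {
                message: "context_window exceeded".to_string(),
            });
        }
        Ok(format!("ok: {input}"))
    }

    /// Toy compaction: replace everything except the last `preserve`
    /// messages with a single summary entry.
    fn compact(&mut self, preserve: usize) {
        let keep_from = self.messages.len().saturating_sub(preserve);
        let recent = self.messages.split_off(keep_from);
        self.messages = vec!["<summary>".to_string()];
        self.messages.extend(recent);
    }
}

/// Retries `input` after compacting, at most MAX_COMPACT_RETRIES times.
/// Errors that are not context window failures are returned unchanged.
fn auto_compact_retry(
    session: &mut Session,
    input: &str,
    first_error: RuntimeError,
) -> Result<String, RuntimeError> {
    let mut last_error = first_error;
    for (attempt, preserve) in PRESERVE_SCHEDULE.iter().enumerate() {
        if !last_error.is_context_window_failure() {
            break;
        }
        eprintln!(
            "Context limit reached, auto-compacting session... (attempt {}/{})",
            attempt + 1,
            MAX_COMPACT_RETRIES
        );
        session.compact(*preserve);
        match session.run_turn(input) {
            Ok(output) => return Ok(output),
            Err(error) => last_error = error,
        }
    }
    Err(last_error)
}

fn main() {
    let mut session = Session {
        messages: (0..10).map(|i| format!("message {i}")).collect(),
        limit: 4,
    };
    let result = match session.run_turn("hello") {
        Err(error) => auto_compact_retry(&mut session, "hello", error),
        ok => ok,
    };
    match result {
        Ok(output) => println!("{output}"),
        Err(error) => eprintln!("turn failed: {}", error.message),
    }
}
```

Driving the loop from a fixed schedule array keeps the retry count and the compaction aggressiveness in one place, so the loop cannot run more than MAX_COMPACT_RETRIES times even if every retry fails.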
Changes:
- rust/crates/api/src/error.rs: Added "no parseable body" marker
- rust/crates/runtime/src/conversation.rs: Added
RUNTIME_CONTEXT_WINDOW_MARKERS constant and
RuntimeError::is_context_window_failure() method
- rust/crates/rusty-claude-cli/src/main.rs: Extracted
LiveCli::auto_compact_retry() with MAX_COMPACT_RETRIES = 3, replaced
inline retry logic in run_turn(), added auto-compact retry to
run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()
Nice improvement: extending auto-compact retry from REPL-only to all execution paths (prompt, compact, json variants) fixes a real UX pain point. Using RuntimeError::is_context_window_failure() via the canonical api crate markers is the right approach, much cleaner than ad-hoc string matching. The progressive compaction strategy (4→2→0) with MAX_COMPACT_RETRIES=3 is well-bounded.

One note: I see cargo fmt is failing in CI. Consider running cargo fmt --all before merge to keep checks green.
Fixes cargo fmt CI check failure noted in reviewer feedback on ultraworkers#3217.
Problem
When the model API returns a context window exceeded error, only the interactive REPL path (run_turn()) had auto-compact retry logic. The non-interactive paths (run_prompt_json, run_prompt_compact, run_prompt_compact_json) simply propagated the error with result? and no retry, causing a hard stop for users running claw -p or claw -p --json commands.

Additionally, context window detection in the existing run_turn() used ad-hoc string matching (contains("context_window") || contains("no parseable body")) instead of the canonical is_context_window_failure() method from the api crate, and the "no parseable body" marker (added for OpenAI-compat backends in PR #3214) was missing from CONTEXT_WINDOW_ERROR_MARKERS.

Solution
1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api crate, so is_context_window_failure() now covers OpenAI-compat backends that return 400 with an un-parseable body.
2. Added RuntimeError::is_context_window_failure() in the runtime crate. Since ApiError is erased into a string message at the runtime boundary, we need a runtime-level marker check that mirrors the api crate's detection. This replaces the ad-hoc inline string matching.
3. Extracted auto-compact retry into LiveCli::auto_compact_retry(), a shared helper method that:
- Detects context window errors via RuntimeError::is_context_window_failure()
- Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
- Retries the same user input with the compacted session
- Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
4. Extended auto-compact retry to ALL turn execution paths:
- run_turn() (interactive REPL) -- now uses shared helper
- run_prompt_compact() (-p --compact) -- auto-retry added
- run_prompt_compact_json() (-p --compact --json) -- auto-retry added
- run_prompt_json() (-p --json) -- auto-retry added

Changes
- rust/crates/api/src/error.rs: Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS
- rust/crates/runtime/src/conversation.rs: Added RUNTIME_CONTEXT_WINDOW_MARKERS constant and RuntimeError::is_context_window_failure() method
- rust/crates/rusty-claude-cli/src/main.rs: Added LiveCli::MAX_COMPACT_RETRIES = 3 constant and LiveCli::auto_compact_retry() method; replaced inline retry logic in run_turn() with delegation to the shared helper; added auto-compact retry to run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()

Diff Verification
The 127 deletions in main.rs are the inline retry code being extracted into auto_compact_retry() (refactor, not revert). No upstream commits were lost.

Testing
- cargo test -p api -p runtime: all pass
- cargo check --workspace: clean (no warnings from modified code)
- cargo fmt -p api -p runtime -p tools: no changes needed

Manual testing scenario:
- Run claw in a session with many messages until the context limit is hit
- Run claw -p "long prompt" and claw -p --json "long prompt" as well

Related: Builds on error detection from PR #3015 and the "no parseable body" marker from PR #3214.