Skip to content

feat: auto-compact and retry on context window errors#3217

Open
TheArchitectit wants to merge 2 commits into
ultraworkers:mainfrom
TheArchitectit:worktree-auto-compact-retry
Open

feat: auto-compact and retry on context window errors#3217
TheArchitectit wants to merge 2 commits into
ultraworkers:mainfrom
TheArchitectit:worktree-auto-compact-retry

Conversation

@TheArchitectit
Copy link
Copy Markdown
Contributor

Problem

When the model API returns a context window exceeded error, only the interactive REPL path (run_turn()) had auto-compact retry logic. The non-interactive paths (run_prompt_json, run_prompt_compact, run_prompt_compact_json) simply propagated the error with result? and no retry, causing a hard stop for users running claw -p or claw -p --json commands.

Additionally, context window detection in the existing run_turn() used ad-hoc string matching (contains("context_window") || contains("no parseable body")) instead of the canonical is_context_window_failure() method from the api crate, and the "no parseable body" marker (added for OpenAI-compat backends in PR #3214) was missing from CONTEXT_WINDOW_ERROR_MARKERS.

Solution

  1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api crate, so is_context_window_failure() now covers OpenAI-compat backends that return 400 with an un-parseable body.

  2. Added RuntimeError::is_context_window_failure() in the runtime crate. Since ApiError is erased into a string message at the runtime boundary, we need a runtime-level marker check that mirrors the api crate's detection. This replaces the ad-hoc inline string matching.

  3. Extracted auto-compact retry into LiveCli::auto_compact_retry() -- a shared helper method that:

    • Detects context window errors via RuntimeError::is_context_window_failure()
    • Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
    • Retries the same user input with the compacted session
    • Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
    • Logs user-facing messages: "Context limit reached, auto-compacting session... (attempt N/3)"
  4. Extended auto-compact retry to ALL turn execution paths:

    • run_turn() (interactive REPL) -- now uses shared helper
    • run_prompt_compact() (-p --compact) -- auto-retry added
    • run_prompt_compact_json() (-p --compact --json) -- auto-retry added
    • run_prompt_json() (-p --json) -- auto-retry added

Changes

  • rust/crates/api/src/error.rs: Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS
  • rust/crates/runtime/src/conversation.rs: Added RUNTIME_CONTEXT_WINDOW_MARKERS constant and RuntimeError::is_context_window_failure() method
  • rust/crates/rusty-claude-cli/src/main.rs: Added LiveCli::MAX_COMPACT_RETRIES = 3 constant, LiveCli::auto_compact_retry() method; replaced inline retry logic in run_turn() with delegation to the shared helper; added auto-compact retry to run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()

Diff Verification

rust/crates/api/src/error.rs             |   1 +
rust/crates/runtime/src/conversation.rs  |  37 +++++
rust/crates/rusty-claude-cli/src/main.rs | 273 +++++++++++++++++--------------
3 files changed, 184 insertions(+), 127 deletions(-)

The 127 deletions in main.rs are the inline retry code being extracted into auto_compact_retry() (refactor, not revert). No upstream commits were lost.

Testing

  1. Unit tests: cargo test -p api -p runtime -- all pass
  2. Compile check: cargo check --workspace -- clean (no warnings from modified code)
  3. Format: cargo fmt -p api -p runtime -p tools -- no changes needed

Manual testing scenario:

  • Run claw in a session with many messages until context limit is hit
  • Observe auto-compact message: "Context limit reached, auto-compacting session... (attempt 1/3)"
  • Session should automatically compact and retry the turn without user intervention
  • Test with claw -p "long prompt" and claw -p --json "long prompt" as well

Related: Builds on error detection from PR #3015 and the "no parseable body" marker from PR #3214.

When the model API returns a context window exceeded error, the CLI now
automatically compacts the session to free up token budget, then retries
the failed turn. This prevents users from hitting a hard stop when
sessions grow too long.

Problem:
Previously, auto-compact retry only worked in the interactive REPL path
(run_turn). The non-interactive paths (run_prompt_json,
run_prompt_compact, run_prompt_compact_json) simply propagated the
error with a result? and no retry. Additionally, context window
detection used ad-hoc string matching (contains("context_window") ||
contains("no parseable body")) instead of the canonical detection
method in the api crate.

Solution:
1. Added "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS in the api
   crate, so is_context_window_failure() now covers OpenAI-compat
   backends that return 400 with an un-parseable body when the request
   exceeds context limits.

2. Added RuntimeError::is_context_window_failure() method in the
   runtime crate. Since ApiError is erased into a string message when
   it crosses the runtime boundary, we need a runtime-level marker
   check that mirrors the api crate's detection. This replaces the
   ad-hoc string matching that was inlined in run_turn().

3. Extracted the auto-compact retry logic from run_turn() into a
   shared LiveCli::auto_compact_retry() method. This method:
   - Detects context window errors via RuntimeError::is_context_window_failure()
   - Compacts progressively (preserve 4 -> 2 -> 0 recent messages)
   - Retries the same user input with the compacted session
   - Is bounded by MAX_COMPACT_RETRIES = 3 to prevent infinite loops
   - Logs user-facing messages like "Context limit reached, auto-compacting
     session... (attempt N/3)"

4. Extended auto-compact retry to ALL turn execution paths:
   - run_turn() (interactive REPL) — now uses shared helper
   - run_prompt_compact() (-p --compact) — auto-retry added
   - run_prompt_compact_json() (-p --compact --json) — auto-retry added
   - run_prompt_json() (-p --json) — auto-retry added

Changes:
- rust/crates/api/src/error.rs: Added "no parseable body" marker
- rust/crates/runtime/src/conversation.rs: Added
  RUNTIME_CONTEXT_WINDOW_MARKERS constant and
  RuntimeError::is_context_window_failure() method
- rust/crates/rusty-claude-cli/src/main.rs: Extracted
  LiveCli::auto_compact_retry() with MAX_COMPACT_RETRIES = 3, replaced
  inline retry logic in run_turn(), added auto-compact retry to
  run_prompt_compact(), run_prompt_compact_json(), run_prompt_json()
@1716775457damn
Copy link
Copy Markdown

Nice improvement — extending auto-compact retry from REPL-only to all execution paths (prompt, compact, json variants) fixes a real UX pain point. Using RuntimeError::is_context_window_failure() via the canonical api crate markers is the right approach, much cleaner than ad-hoc string matching. The progressive compaction strategy (4→2→0) with MAX_COMPACT_RETRIES=3 is well-bounded.\n\nOne note: I see cargo fmt is failing in CI — consider running cargo fmt --all before merge to keep checks green.

Fixes cargo fmt CI check failure noted in reviewer feedback on ultraworkers#3217.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants