fix: vision_load crash recovery and robust image message handling#1020
Open
hurtdidit wants to merge 1 commit intoagent0ai:developmentfrom
Open
fix: vision_load crash recovery and robust image message handling#1020hurtdidit wants to merge 1 commit intoagent0ai:developmentfrom
hurtdidit wants to merge 1 commit intoagent0ai:developmentfrom
Conversation
Fixes a critical bug where vision_load permanently crashes chat sessions
when a provider rejects image content (HTTP 400). The raw image data
persists in serialized history, causing every subsequent message to
re-trigger the same error in an unrecoverable loop.
4-part fix:
1. history.py: Explicit raw message handling in summarize_messages()
- Replaces FIXME; guards against vision bytes reaching utility LLM
2. history.py: Safe merge guard in _merge_outputs()
- Prevents RawMessage/text content mixing during ABAB grouping
3. vision_load.py: Error recovery with text fallback
- try/except around hist_add_message with graceful degradation
4. agent.py: Self-healing crash loop recovery (key fix)
- _strip_raw_images_from_history() scans and replaces RawMessages
- Both monologue exception handlers detect BadRequestError/400,
strip images, and retry cleanly instead of entering crash loop
Tested with OpenAI-compatible proxy routing to non-OpenAI model.
Confirmed: fresh chats work, crash-triggering images recover gracefully,
and previously stuck chats become usable again.
Contributor
Author
|
I've an image that consistently causes the crash (a screenshot, I suspect characters in the file name are to blame) -- if you'd like me to send it to you for easier testing/replication, just DM me in Discord. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a critical bug where the
vision_loadtool crashes chat sessions permanently when a provider rejects image content (HTTP 400). Once crashed, the chat cannot recover — every subsequent message re-triggers the same error because the raw image data persists in serialized history.This 4-part fix adds defensive handling at multiple layers to prevent crashes and enable self-healing recovery.
Problem
Root Cause
When
vision_loadprocesses an image, it creates aRawMessagewith base64-encoded image data in OpenAI multimodal format (image_urlcontent blocks). This works fine for providers that support it natively, but fails catastrophically when:Crash Sequence
Additional Issues Found
summarize_messages()(history.py): Had a FIXME noting that vision bytes get sent to the utility LLM during context compression. Whileoutput_text()returns preview strings forRawMessage, there was no explicit guard._merge_outputs()(history.py): Whengroup_messages_abab()merges consecutive same-role messages,RawMessagedicts can get mixed with text content, creating invalid message structures.Fix (4 Parts)
Part 1: Explicit raw message handling in
summarize_messages()(history.py)_is_raw_message()on each message"<non-text content>")Part 2: Safe message merging in
_merge_outputs()(history.py)RawMessage, convert to text preview via_stringify_content()before merginggroup_messages_abab()from creating invalid mixed-format message structuresPart 3: Error recovery in
vision_load.pyhist_add_message()call in try/excepthist_add_tool_result()Part 4: Self-healing crash loop recovery in
agent.py⭐This is the critical fix that breaks the permanent crash loop.
_strip_raw_images_from_history()method that scanshistory.current.messagesandhistory.topicsforRawMessagecontent, replacing each with a text preview viamsg.set_summary()monologue()(inner message loop + outer monologue loop):BadRequestError/ 400 status → calls_strip_raw_images_from_history()continue(retry without bad images)retry_critical_exception()logicFiles Changed
python/helpers/history.pypython/tools/vision_load.pyagent.pyTesting
Tested with OpenAI-compatible proxy routing to a non-OpenAI model (triggers the 400 rejection).
Reproduction Steps (Without Fix)
vision_loadto attach an imageBackward Compatibility
Related