fix: vision_load crash recovery and robust image message handling#1020

Open
hurtdidit wants to merge 1 commit into agent0ai:development from hurtdidit:fix/vision-load-crash-recovery

Conversation

@hurtdidit
Contributor

Summary

Fixes a critical bug where the vision_load tool crashes chat sessions permanently when a provider rejects image content (HTTP 400). Once crashed, the chat cannot recover — every subsequent message re-triggers the same error because the raw image data persists in serialized history.

This 4-part fix adds defensive handling at multiple layers to prevent crashes and enable self-healing recovery.

Problem

Root Cause

When vision_load processes an image, it creates a RawMessage with base64-encoded image data in OpenAI multimodal format (image_url content blocks). This works fine for providers that support it natively, but fails catastrophically when:

  1. The provider rejects the image format (HTTP 400)
  2. The image is too large or in an unsupported format
  3. An OpenAI-compatible proxy routing to a non-OpenAI model doesn't translate the image format correctly
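To make the failure mode concrete, here is a minimal sketch of the kind of multimodal payload involved. The helper name `build_image_message` is hypothetical; the `image_url` content-block shape follows the OpenAI multimodal format the PR describes, which is exactly the structure a non-OpenAI provider may reject with a 400.

```python
import base64

def build_image_message(image_bytes: bytes, mime: str = "image/png") -> dict:
    # Hypothetical sketch of the raw multimodal message vision_load
    # persists into chat history: a content LIST of typed blocks,
    # including a base64 data URL, rather than a plain string.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "Image attached:"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{b64}"},
            },
        ],
    }
```

Because this dict is serialized into history as-is, the rejected payload is re-sent on every subsequent model call until something removes it.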

Crash Sequence

vision_load adds RawMessage with base64 image to chat history
    ↓
Next monologue loop: call_chat_model() sends entire history to LLM
    ↓
Provider returns HTTP 400 ("Bad Request" / format rejection)
    ↓  
retry_critical_exception() retries with SAME history → SAME 400
    ↓
handle_critical_exception() kills the monologue loop
    ↓
Image data persists in history JSON on disk
    ↓
Next user message → new monologue → sends same history → same 400
    ↓
💥 PERMANENT CRASH LOOP — chat is unrecoverable

Additional Issues Found

  • summarize_messages() (history.py): Had a FIXME noting that vision bytes get sent to the utility LLM during context compression. While output_text() returns preview strings for RawMessage, there was no explicit guard.
  • _merge_outputs() (history.py): When group_messages_abab() merges consecutive same-role messages, RawMessage dicts can get mixed with text content, creating invalid message structures.

Fix (4 Parts)

Part 1: Explicit raw message handling in summarize_messages() (history.py)

  • Replaced the FIXME with proper handling
  • Before building text for the utility model, checks _is_raw_message() on each message
  • Raw messages are replaced with their preview string (e.g., "<non-text content>")
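The guard described above can be sketched as follows. The function names and the list-vs-string content check are assumptions for illustration, not the actual `history.py` implementation; the key idea is that multimodal content is swapped for its preview string before any text is handed to the utility model.

```python
def _is_raw_message(msg: dict) -> bool:
    # Assumed predicate: a "raw" message carries a list of typed
    # content blocks (e.g. image_url) rather than a plain string.
    return isinstance(msg.get("content"), list)

def prepare_for_summary(messages: list[dict]) -> list[str]:
    # Build the text fed to the utility LLM during compression,
    # replacing raw messages with a preview so base64 image bytes
    # never reach the summarizer.
    texts: list[str] = []
    for msg in messages:
        if _is_raw_message(msg):
            texts.append("<non-text content>")
        else:
            texts.append(str(msg.get("content", "")))
    return texts
```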

Part 2: Safe message merging in _merge_outputs() (history.py)

  • Added guard at the top: if either operand is a RawMessage, convert to text preview via _stringify_content() before merging
  • Prevents group_messages_abab() from creating invalid mixed-format message structures
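A minimal sketch of that merge guard, with hypothetical names (`_stringify_content`, `merge_outputs` stand in for the real `history.py` helpers): both operands are normalized to plain strings before concatenation, so an ABAB merge can never splice a content list into a text message.

```python
def _stringify_content(content) -> str:
    # Assumed behavior: collapse a multimodal content list into a
    # short text preview; pass plain strings through unchanged.
    if isinstance(content, list):
        return "<non-text content>"
    return str(content)

def merge_outputs(a, b) -> str:
    # Guard from Part 2: convert either raw operand to its preview
    # first, so the merged result is always a valid plain string.
    return _stringify_content(a) + "\n" + _stringify_content(b)
```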

Part 3: Error recovery in vision_load.py

  • Wrapped hist_add_message() call in try/except
  • On failure: logs error and adds a descriptive text fallback via hist_add_tool_result()
  • Prevents corrupted image data from persisting if the history add itself fails
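The recovery path in Part 3 might look roughly like this sketch. `add_image_with_fallback` and its parameters are hypothetical (the real code wraps `hist_add_message()` inside the tool's `after_execution`); the point is that a failing add is logged and replaced by a plain-text tool result instead of leaving corrupt image data in history.

```python
def add_image_with_fallback(history: list, raw_msg: dict, add_fn, log=print) -> None:
    # Part 3 sketch (hypothetical names): wrap the history add in
    # try/except; on failure, log and append a descriptive text
    # fallback rather than persisting the raw image message.
    try:
        add_fn(history, raw_msg)
    except Exception as exc:
        log(f"vision_load: could not attach image ({exc}); adding text fallback")
        history.append({"role": "tool", "content": "<image could not be attached>"})
```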

Part 4: Self-healing crash loop recovery in agent.py

This is the critical fix that breaks the permanent crash loop.

  • Added _strip_raw_images_from_history() method that scans history.current.messages and history.topics for RawMessage content, replacing each with a text preview via msg.set_summary()
  • Modified both exception handlers in monologue() (inner message loop + outer monologue loop):
    • On BadRequestError / 400 status → calls _strip_raw_images_from_history()
    • If images were stripped → continue (retry without bad images)
    • If no images found → falls through to existing retry_critical_exception() logic
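The stripping-and-retry logic above can be sketched as below. This is a simplified assumption of the shape of `_strip_raw_images_from_history()` and the handler decision (the real method also walks `history.topics` and uses `msg.set_summary()`); the return-count-then-decide pattern is what breaks the crash loop.

```python
def strip_raw_images_from_history(messages: list) -> int:
    # Replace every raw (list-form, multimodal) message content with a
    # text preview; return the count so the caller knows whether a
    # retry without the offending images is worthwhile.
    stripped = 0
    for msg in messages:
        if isinstance(msg.get("content"), list):
            msg["content"] = "<non-text content>"
            stripped += 1
    return stripped

def on_provider_error(messages: list, status_code: int) -> str:
    # Hypothetical handler shape: on a 400, strip images and retry;
    # otherwise fall through to the existing critical-exception path.
    if status_code == 400 and strip_raw_images_from_history(messages) > 0:
        return "retry"
    return "escalate"
```

On the second pass with the same 400 (nothing left to strip), the handler escalates normally, so genuinely unrelated bad requests still reach `retry_critical_exception()`.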

Files Changed

File                          Changes    Description
python/helpers/history.py     ~+23       Parts 1 & 2: summarize_messages guard, _merge_outputs guard
python/tools/vision_load.py   ~+21/-5    Part 3: try/except error recovery in after_execution
agent.py                      ~+58       Part 4: _strip_raw_images_from_history + exception handler guards

Testing

Test Case                                             Result
Fresh chat + image load                               ✅ Works normally
Image that triggers provider 400                      ✅ Part 4 catches error, strips image, chat continues
Previously crashed/stuck chat                         ✅ Recovered — user can continue chatting
Long conversation after image (compression trigger)   ✅ Parts 1 & 2 handle summarization/merging safely
Multiple images in conversation                       ✅ All stripped on error, chat survives

Tested with OpenAI-compatible proxy routing to a non-OpenAI model (triggers the 400 rejection).

Reproduction Steps (Without Fix)

  1. Configure Agent Zero with an LLM provider that rejects certain image formats (e.g., OpenAI-compatible proxy to a non-OpenAI model)
  2. Start a new chat
  3. Use vision_load to attach an image
  4. Observe: provider returns 400 → chat crashes
  5. Try sending another message → same 400 → permanent crash
  6. Restart container → same chat still crashes on any message

Backward Compatibility

  • No API changes
  • No configuration changes required
  • Purely additive error handling — existing functionality unchanged for providers that accept images normally
  • The fix gracefully degrades: if images work fine, none of these code paths activate

Related

  • Resolves the FIXME comment at history.py line ~218 ("vision bytes are sent to utility LLM")
  • Improves resilience for all LLM providers, not just the specific proxy configuration that exposed the bug

Fixes a critical bug where vision_load permanently crashes chat sessions
when a provider rejects image content (HTTP 400). The raw image data
persists in serialized history, causing every subsequent message to
re-trigger the same error in an unrecoverable loop.

4-part fix:
1. history.py: Explicit raw message handling in summarize_messages()
   - Replaces FIXME; guards against vision bytes reaching utility LLM
2. history.py: Safe merge guard in _merge_outputs()
   - Prevents RawMessage/text content mixing during ABAB grouping
3. vision_load.py: Error recovery with text fallback
   - try/except around hist_add_message with graceful degradation
4. agent.py: Self-healing crash loop recovery (key fix)
   - _strip_raw_images_from_history() scans and replaces RawMessages
   - Both monologue exception handlers detect BadRequestError/400,
     strip images, and retry cleanly instead of entering crash loop

Tested with OpenAI-compatible proxy routing to non-OpenAI model.
Confirmed: fresh chats work, crash-triggering images recover gracefully,
and previously stuck chats become usable again.
@hurtdidit
Contributor Author

I have an image that consistently causes the crash (a screenshot; I suspect characters in the file name are to blame). If you'd like me to send it to you for easier testing/replication, just DM me on Discord.
