Description
The SSE stream drainer in crates/zeph-llm/src/sse.rs accumulates tool-call JSON input and thinking text without any size bound:
state.tool_block (line 322): json.push_str(&delta.partial_json) — no cap
state.thinking_block (line 333): t.push_str(&delta.thinking) — no cap
By contrast, the compaction buffer has an explicit 32 KiB cap (MAX_COMPACTION_BUF = 32 * 1024, lines 170 and 309) with a warning log and excess discard.
A malicious or pathological server response could stream an endless sequence of input_json_delta or thinking_delta events, growing the heap without limit.
The compaction buffer is already protected against this. The tool-call JSON and thinking accumulation buffers should get the same protection.
Expected Behavior
Add MAX_TOOL_JSON_BUF (e.g., 4 MiB) and MAX_THINKING_BUF (e.g., 1 MiB) constants, and apply the same guard/discard pattern used for the compaction buffer to both accumulators.
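A minimal sketch of what such a guard could look like. The existing compaction guard is not quoted in this issue, so the shape here is an assumption. push_capped is a hypothetical helper, not an existing function in sse.rs, and the constant values are just the ones suggested above. The caller would be expected to emit the warning log when the helper returns false.

```rust
// Hypothetical caps, using the values suggested in this issue.
const MAX_TOOL_JSON_BUF: usize = 4 * 1024 * 1024; // 4 MiB
const MAX_THINKING_BUF: usize = 1024 * 1024; // 1 MiB

/// Appends `chunk` to `buf` without letting `buf` grow past `max` bytes.
/// Returns `true` if the whole chunk fit, `false` if any part was discarded.
/// (Hypothetical helper for illustration only.)
fn push_capped(buf: &mut String, chunk: &str, max: usize) -> bool {
    let remaining = max.saturating_sub(buf.len());
    if chunk.len() <= remaining {
        buf.push_str(chunk);
        return true;
    }
    // Back off to a UTF-8 character boundary so the slice below cannot panic.
    let mut end = remaining;
    while !chunk.is_char_boundary(end) {
        end -= 1;
    }
    // Keep what fits and drop the excess.
    buf.push_str(&chunk[..end]);
    false
}
```

At the two call sites this would replace the bare push_str calls, e.g. push_capped(json, &delta.partial_json, MAX_TOOL_JSON_BUF) and push_capped(t, &delta.thinking, MAX_THINKING_BUF). One design point to settle: truncated tool-call JSON will most likely no longer parse, so for the tool buffer it may be cleaner to fail the tool call once the cap is hit than to pass the truncated input on.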
Actual Behavior
input_json_delta and thinking_delta accumulate without bound.
Environment
Logs / Evidence
// crates/zeph-llm/src/sse.rs:322
if let Some((_, _, _, ref mut json)) = state.tool_block {
    json.push_str(&delta.partial_json); // no cap
}
// crates/zeph-llm/src/sse.rs:333
if let Some((ref mut t, _)) = state.thinking_block {
    t.push_str(&delta.thinking); // no cap
}
Compare with capped compaction at line 170.