Comments

context truncation/compaction improvements #1831

Draft
krissetto wants to merge 1 commit into docker:main from krissetto:better-cache-stability-and-context-preservation

Conversation

@krissetto
Contributor

  • Removes truncateOldToolContent and MaxToolCallTokens from session to avoid busting cache unnecessarily and potentially confusing models
  • Preserves assistant text as a separate message item before function_call items in convertMessagesToResponseInput (responses API)
  • Lowers the default context limit before compaction to 80% of the model's context length. Anything past 50% usually sees progressively bigger drops in output quality, so 80% seems a good point to compact at

These are opinionated changes; things generally seem to perform better if we let the caching do its job and don't edit/remove things from the history.
Let's do some tests and see how we feel about them
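The threshold change in the last bullet can be sketched roughly as below. This is an illustrative snippet, not cagent's actual code; the names `shouldCompact`, `usedTokens`, and `contextLength` are assumptions for the example.

```go
package main

import "fmt"

// compactionThreshold is the fraction of the model's context window at
// which the session history gets summarized (80% per this PR, down from
// the previous default).
const compactionThreshold = 0.8

// shouldCompact reports whether the current token usage has crossed the
// compaction threshold for a model with the given context length.
func shouldCompact(usedTokens, contextLength int) bool {
	return float64(usedTokens) >= compactionThreshold*float64(contextLength)
}

func main() {
	// With a 128k-context model, the cutoff sits at 102,400 tokens.
	fmt.Println(shouldCompact(100_000, 128_000)) // below the 80% mark
	fmt.Println(shouldCompact(110_000, 128_000)) // above the 80% mark
}
```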

- Removes truncateOldToolContent and MaxToolCallTokens from session to avoid busting cache unnecessarily and potentially confusing models
- Preserves assistant text as a separate message item before function_call
items in convertMessagesToResponseInput (responses API)
- Lowers default context limit before compaction to 80% of model's context length, anything after 50% usually sees progressively bigger drops in output quality

Signed-off-by: Christopher Petito <chrisjpetito@gmail.com>
@rumpl
Member

rumpl commented Feb 24, 2026

Preserves assistant text as a separate message item before function_call items in convertMessagesToResponseInput (responses API)

Why?

@rumpl
Member

rumpl commented Feb 24, 2026

The same way we can set the max number of messages to keep in the context, we should maybe make this configurable?

@krissetto
Contributor Author

krissetto commented Feb 24, 2026

Preserves assistant text as a separate message item before function_call items in convertMessagesToResponseInput (responses API)

Why?

Why not? Some models can say something together with their tool calls; we were dropping that text entirely, which just seems wrong to me
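To illustrate the idea: an assistant turn that carries both text and tool calls gets emitted as two input items, a message item followed by the function_call items, instead of dropping the text. The types and field names below are simplified placeholders, not the actual structures used by `convertMessagesToResponseInput`.

```go
package main

import "fmt"

// Message is a simplified assistant turn: optional text plus tool calls.
type Message struct {
	Content   string
	ToolCalls []string // tool call names, simplified
}

// InputItem is a simplified responses-API input item.
type InputItem struct {
	Type    string // "message" or "function_call"
	Payload string
}

// convertAssistantMessage splits an assistant turn into input items,
// preserving any assistant text as its own message item *before* the
// function_call items rather than discarding it.
func convertAssistantMessage(m Message) []InputItem {
	var items []InputItem
	if m.Content != "" {
		items = append(items, InputItem{Type: "message", Payload: m.Content})
	}
	for _, tc := range m.ToolCalls {
		items = append(items, InputItem{Type: "function_call", Payload: tc})
	}
	return items
}

func main() {
	items := convertAssistantMessage(Message{
		Content:   "Let me check the file first.",
		ToolCalls: []string{"read_file"},
	})
	for _, it := range items {
		fmt.Println(it.Type, "->", it.Payload)
	}
}
```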

The same way we can set the max number of messages to keep in the context, we should maybe make this configurable?

Agree. I think the default should only be compaction, not thread truncation, to align with what are likely to be most users' expectations (aka "don't mess with my messages unless you really have to").
But being able to configure it is good imho
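A configurable threshold could look something like this; the `SessionConfig` type, field name, and zero-value handling are purely hypothetical, sketched under the assumption that the 80% default from the PR stays in place.

```go
package main

import "fmt"

// SessionConfig is a hypothetical config knob for the compaction point,
// mirroring how the max number of messages is already configurable.
type SessionConfig struct {
	// CompactionThreshold is the fraction of the model's context window
	// at which history is compacted. Zero or out-of-range values fall
	// back to the default.
	CompactionThreshold float64
}

// threshold returns the effective compaction fraction, defaulting to 0.8.
func (c SessionConfig) threshold() float64 {
	if c.CompactionThreshold <= 0 || c.CompactionThreshold > 1 {
		return 0.8
	}
	return c.CompactionThreshold
}

func main() {
	def := SessionConfig{}
	custom := SessionConfig{CompactionThreshold: 0.5}
	fmt.Println(def.threshold())    // default
	fmt.Println(custom.threshold()) // user-supplied
}
```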

I've been giving this a ride and honestly it feels quite a bit better to me, especially with OpenAI models. Curious to know if you've noticed any issues, even cost-related ones (not busting cache helps with costs in many scenarios too)
