feat(messages): support adaptive thinking on /v1/messages #1555
Add `ThinkingConfig::Adaptive { display: Option<ThinkingDisplay> }` so the
Anthropic Messages API accepts `{"type": "adaptive"}` (with optional
`{"display": "summarized" | "omitted"}`). Previously SMG rejected the
payload at the ValidatedJson extractor with `unknown variant 'adaptive'`,
forcing callers to maintain a wrapper proxy.
Adaptive mode is required on Claude Opus 4.7 and is the recommended mode on
Opus 4.6 / Sonnet 4.6. See
https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking.
In the gRPC router path, `Adaptive` is treated the same as `Enabled` for
template kwargs (`enable_thinking`/`thinking` = true) and for reasoning
parser gating; the model itself decides whether to actually emit thinking
content. The HTTP router proxies the request body byte-for-byte to upstream.
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
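The gRPC-path mapping described in the commit message above can be sketched roughly as follows. This is a minimal, std-only illustration rather than the code in the PR: the enum is a simplified stand-in for the protocol type, the `budget_tokens` field and the `Disabled => Some(false)` arm are assumptions, and `user_thinking` is written here as a hypothetical free function.

```rust
/// Stand-in for the `display` options named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingDisplay {
    Summarized,
    Omitted,
}

/// Simplified stand-in for the protocol enum (no serde attributes).
/// `budget_tokens` is an assumption about the `Enabled` variant's shape.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ThinkingConfig {
    Enabled {
        budget_tokens: u32,
        display: Option<ThinkingDisplay>,
    },
    Disabled,
    Adaptive {
        display: Option<ThinkingDisplay>,
    },
}

/// Hypothetical helper: derive the tri-state "user asked for thinking" flag.
/// `Adaptive` shares the `Enabled` arm, as the commit message describes.
pub fn user_thinking(thinking: Option<&ThinkingConfig>) -> Option<bool> {
    match thinking {
        Some(ThinkingConfig::Enabled { .. } | ThinkingConfig::Adaptive { .. }) => Some(true),
        // Assumption: an explicit `disabled` maps to `Some(false)`.
        Some(ThinkingConfig::Disabled) => Some(false),
        // No `thinking` field on the request: no user preference.
        None => None,
    }
}
```

With this shape, `user_thinking(Some(&ThinkingConfig::Adaptive { display: None }))` evaluates to `Some(true)`, the same value an `Enabled` config produces.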
Clean, focused PR. All 5 `ThinkingConfig` match sites in the gRPC router are updated consistently (`processor.rs` ×2, `streaming.rs` ×2, `message_utils.rs` ×1). The `#[serde_with::skip_serializing_none]` on the enum is correct: it only affects `Option` fields, which only exist in the new `Adaptive` variant. The HTTP router needs no changes since it proxies byte-for-byte. Tests cover deserialization (minimal + with `display`), round-trip serialization (confirms no spurious `"display":null`), and backward compatibility of existing variants.
Anthropic's API also accepts `display: "summarized" | "omitted"` on
`{"type": "enabled"}`, not just `"adaptive"`. Without this field on the
`Enabled` variant, `display` is silently dropped during serde round-trip
and the upstream worker never sees it — the same silent-loss bug we just
fixed for adaptive.
Existing pattern matches all use `Enabled { .. }`, so this is non-breaking
for them. Only one test-internal `matches!` pattern needed `display: None`
added to compile.
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
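The non-breaking claim in the commit message above rests on how Rust struct-variant patterns work. The sketch below is a std-only illustration with simplified stand-in types (the `budget_tokens` field and both helper functions are assumptions, not code from the PR): a rest pattern `{ .. }` ignores any fields added later, while a pattern that lists fields without `..` has to account for the new `display` field before it compiles.

```rust
/// Stand-in for the `display` options named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingDisplay {
    Summarized,
    Omitted,
}

/// Simplified stand-in; `budget_tokens` is an assumed pre-existing field.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ThinkingConfig {
    Enabled {
        budget_tokens: u32,
        display: Option<ThinkingDisplay>, // the newly added field
    },
    Disabled,
}

/// Rest pattern: compiles unchanged before and after `display` was added.
pub fn is_enabled(cfg: &ThinkingConfig) -> bool {
    matches!(cfg, ThinkingConfig::Enabled { .. })
}

/// Field-listing pattern: without `..`, every field must be named, so
/// adding `display` to the variant forces `display: None` (or `..`) here.
pub fn is_enabled_without_display(cfg: &ThinkingConfig) -> bool {
    matches!(
        cfg,
        ThinkingConfig::Enabled {
            budget_tokens: _,
            display: None
        }
    )
}
```

The second helper mirrors the kind of test-internal `matches!` the commit message mentions: it needed an explicit `display: None` only because it spelled out the variant's fields.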
Code Review
This pull request introduces support for adaptive thinking configuration. It adds an Adaptive variant to the ThinkingConfig enum and a new ThinkingDisplay enum to control how thinking content is returned in API responses. It also updates the gRPC response processors, streaming processors, and message utilities to treat the Adaptive configuration as having thinking enabled, and includes corresponding unit tests. There are no review comments to address, and I have no additional feedback to provide.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a120e0b1d2
  let user_thinking = match &messages_request.thinking {
-     Some(messages::ThinkingConfig::Enabled { .. }) => Some(true),
+     Some(
+         messages::ThinkingConfig::Enabled { .. }
+         | messages::ThinkingConfig::Adaptive { .. },
+     ) => Some(true),
Avoid forcing adaptive requests into reasoning mode
Treating adaptive as `user_thinking = Some(true)` in this path can misclassify normal completions as reasoning when the model chooses not to emit think delimiters. In that case `mark_reasoning_started()` is applied and `detect_and_parse_reasoning` can move the whole answer into a Thinking block, leaving no normal text/tool content, which breaks non-stream `/v1/messages` responses for models with a reasoning parser.
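The failure mode described in this comment can be illustrated with a toy parser. This sketch is not SMG's reasoning parser: `split_reasoning`, the `Parsed` struct, and the `<think>` / `</think>` delimiters are all hypothetical, and the `start_in_reasoning` flag only mimics the effect the comment attributes to `mark_reasoning_started()`.

```rust
/// Toy result type: text classified as thinking vs. normal answer text.
#[derive(Debug, PartialEq, Eq)]
pub struct Parsed {
    pub thinking: String,
    pub text: String,
}

/// Toy splitter (hypothetical). When `start_in_reasoning` is true, the
/// output is assumed to begin inside a reasoning span even if the model
/// never emitted an opening delimiter.
pub fn split_reasoning(output: &str, start_in_reasoning: bool) -> Parsed {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    // An explicit opening delimiter always starts a reasoning span.
    let (in_reasoning, body) = match output.strip_prefix(OPEN) {
        Some(rest) => (true, rest),
        None => (start_in_reasoning, output),
    };

    if !in_reasoning {
        // Plain completion: everything is normal text.
        return Parsed {
            thinking: String::new(),
            text: output.to_string(),
        };
    }

    match body.split_once(CLOSE) {
        // Closing delimiter found: split thinking from the answer.
        Some((thinking, text)) => Parsed {
            thinking: thinking.to_string(),
            text: text.to_string(),
        },
        // No closing delimiter: the whole output stays classified as
        // thinking, which is the misclassification the comment warns about.
        None => Parsed {
            thinking: body.to_string(),
            text: String::new(),
        },
    }
}
```

In this toy model, calling `split_reasoning("The answer is 4.", true)` leaves `text` empty and puts the whole answer in `thinking`, whereas passing `false` keeps it as normal text. Whether the real parser behaves the same way for adaptive requests would need to be checked against its implementation.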
  let user_thinking = match &original_request.thinking {
-     Some(messages::ThinkingConfig::Enabled { .. }) => Some(true),
+     Some(
+         messages::ThinkingConfig::Enabled { .. }
+         | messages::ThinkingConfig::Adaptive { .. },
+     ) => Some(true),
Prevent adaptive mode from hijacking streamed text as thinking
This maps adaptive to `Some(true)`, which enables `thinking_override` and causes the streaming reasoning parser to start in reasoning mode even when the model emits plain text. For adaptive requests where no closing reasoning token appears, chunks are emitted as `thinking_delta` instead of normal text/tool deltas, corrupting streamed Messages output.
Description
Problem
The Anthropic Messages API endpoint (`/v1/messages`) rejects payloads with `thinking: {"type": "adaptive"}` at the `ValidatedJson` extractor with `unknown variant 'adaptive', expected 'enabled' or 'disabled'`. Adaptive thinking is required on Claude Opus 4.7 and recommended on Opus 4.6 / Sonnet 4.6 (see docs), so users are stuck running a wrapper proxy to rewrite the request before it hits SMG.
A related silent-data-loss bug: Anthropic also accepts a `display` field on `{"type":"enabled",...}`, but since the `Enabled` variant didn't model it, the field was silently dropped during serde round-trip and the upstream worker never saw it.
Solution
Add `ThinkingConfig::Adaptive { display: Option<ThinkingDisplay> }` and add the same `display: Option<ThinkingDisplay>` field to `ThinkingConfig::Enabled`. In the gRPC router path, `Adaptive` is treated the same as `Enabled` for chat-template kwargs (`enable_thinking` / `thinking`) and for reasoning-parser gating; the model itself decides whether to actually emit thinking content. The HTTP router proxies the request body byte-for-byte, so no special handling is needed there.
Changes
- `crates/protocols/src/messages.rs`:
  - `ThinkingDisplay` enum (`Summarized` / `Omitted`).
  - `ThinkingConfig::Adaptive { display }` variant.
  - `display: Option<ThinkingDisplay>` field added to `ThinkingConfig::Enabled`.
  - Tests (`display` field, no-`display` round-trip).
- `model_gateway/src/routers/grpc/utils/message_utils.rs`: include `Adaptive` in the "thinking on" arm of the template kwargs match.
- `model_gateway/src/routers/grpc/regular/processor.rs` (2 sites): include `Adaptive` in the `separate_reasoning` `matches!` and the `user_thinking` match.
- `model_gateway/src/routers/grpc/regular/streaming.rs` (2 sites): same pair of updates for the streaming path.
All existing pattern matches on `Enabled` already use `{ .. }`, so adding `display` to the variant is non-breaking for them.
Test Plan
Before (current `main`):
After (this PR):
Reproduction with the previously-failing payload:
Checklist
- `cargo +nightly fmt` passes
- `cargo clippy --all-targets --all-features -- -D warnings` passes (for `openai-protocol` and `smg`)
Follow-ups (separate PRs)
- Unmodeled fields on `CreateMessageRequest` (e.g. `output_config`, `context_management`) are currently silently dropped during serde round-trip, which means the upstream worker never sees them. Idiomatic fix: add `#[serde(flatten)] pub extra: HashMap<String, Value>`, matching the existing pattern on `InputSchema`.