
feat(messages): support adaptive thinking on /v1/messages #1555

Merged
slin1237 merged 2 commits into main from claude/ecstatic-liskov-a2d0e1
May 27, 2026

Conversation

@CatherineSue
Member

@CatherineSue CatherineSue commented May 27, 2026

Description

Problem

The Anthropic Messages API endpoint (/v1/messages) rejects payloads with thinking: {"type": "adaptive"} at the ValidatedJson extractor with unknown variant 'adaptive', expected 'enabled' or 'disabled'. Adaptive thinking is required on Claude Opus 4.7 and recommended on Opus 4.6 / Sonnet 4.6 (see https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking), so users are stuck running a wrapper proxy to rewrite the request before it hits SMG.

A related silent-data-loss bug: Anthropic also accepts a display field on {"type":"enabled",...}, but since the Enabled variant didn't model it, the field was silently dropped during serde round-trip and the upstream worker never saw it.

Solution

Add ThinkingConfig::Adaptive { display: Option<ThinkingDisplay> } and add the same display: Option<ThinkingDisplay> field to ThinkingConfig::Enabled. In the gRPC router path, Adaptive is treated the same as Enabled for chat-template kwargs (enable_thinking/thinking) and for reasoning-parser gating — the model itself decides whether to actually emit thinking content. The HTTP router proxies the request body byte-for-byte, so no special handling is needed there.
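
The variant shapes described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in crates/protocols/src/messages.rs: the serde derives and attributes are omitted, the `budget_tokens` field on `Enabled` is an assumption that does not appear in this description, and mapping `Disabled` to `Some(false)` and a missing config to `None` is likewise assumed.

```rust
/// How thinking content is returned (variant names taken from the PR).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingDisplay {
    Summarized,
    Omitted,
}

/// Sketch of the extended enum; serde derives and attributes are left out.
#[allow(dead_code)] // fields are only matched with `..` in this sketch
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ThinkingConfig {
    /// `budget_tokens` is an assumption, not shown in the PR description.
    Enabled { budget_tokens: u32, display: Option<ThinkingDisplay> },
    Disabled,
    /// New variant: the model decides whether to emit thinking content.
    Adaptive { display: Option<ThinkingDisplay> },
}

/// Mirrors the `user_thinking` match: Adaptive shares the "thinking on" arm.
pub fn user_thinking(cfg: Option<&ThinkingConfig>) -> Option<bool> {
    match cfg {
        Some(ThinkingConfig::Enabled { .. } | ThinkingConfig::Adaptive { .. }) => Some(true),
        // Assumed: an explicit Disabled maps to an explicit "off".
        Some(ThinkingConfig::Disabled) => Some(false),
        // Assumed: no config means no override.
        None => None,
    }
}
```

Calling `user_thinking(Some(&ThinkingConfig::Adaptive { display: None }))` yields `Some(true)`, the same result as for `Enabled`.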

Changes

  • crates/protocols/src/messages.rs:
    • Add ThinkingDisplay enum (Summarized / Omitted).
    • Add ThinkingConfig::Adaptive { display } variant.
    • Add display: Option<ThinkingDisplay> field to ThinkingConfig::Enabled.
    • 6 deserialize tests covering both variants × (minimal form, display field, no-display round-trip).
  • model_gateway/src/routers/grpc/utils/message_utils.rs: include Adaptive in the "thinking on" arm of the template kwargs match.
  • model_gateway/src/routers/grpc/regular/processor.rs (2 sites): include Adaptive in separate_reasoning matches! and user_thinking match.
  • model_gateway/src/routers/grpc/regular/streaming.rs (2 sites): same pair of updates for the streaming path.

All existing pattern matches on Enabled already use { .. }, so adding display to the variant is non-breaking for them.
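
The non-breaking claim rests on Rust's rest pattern: `Enabled { .. }` ignores every field, so it keeps compiling when `display` is added. A small sketch with simplified stand-in types follows. Here `display` is a plain `Option<&'static str>`, `budget_tokens` is an assumed field, and `separate_reasoning` is written as a free function purely for illustration.

```rust
#[allow(dead_code)] // fields are only matched with `..` in this sketch
#[derive(Debug)]
pub enum ThinkingConfig {
    // `display` is the newly added field; `{ .. }` patterns are unaffected by it.
    Enabled { budget_tokens: u32, display: Option<&'static str> },
    Disabled,
    Adaptive { display: Option<&'static str> },
}

/// Reasoning-parser gate: true when thinking is Enabled or Adaptive.
pub fn separate_reasoning(cfg: &Option<ThinkingConfig>) -> bool {
    matches!(
        cfg,
        Some(ThinkingConfig::Enabled { .. } | ThinkingConfig::Adaptive { .. })
    )
}
```

By contrast, a pattern that lists fields without `..`, such as `Enabled { budget_tokens: _ }`, stops compiling once a new field is added to the variant.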

Test Plan

Before (current main):

$ echo '{"type":"adaptive"}' | serde_json::from_str::<ThinkingConfig>
REJECTED: unknown variant `adaptive`, expected `enabled` or `disabled`

After (this PR):

$ cargo test -p openai-protocol --lib messages
test messages::tests::test_thinking_config_adaptive_minimal ... ok
test messages::tests::test_thinking_config_adaptive_with_display ... ok
test messages::tests::test_thinking_config_adaptive_round_trip_omits_null_display ... ok
test messages::tests::test_thinking_config_enabled_with_display ... ok
test messages::tests::test_thinking_config_enabled_round_trip_omits_null_display ... ok
test messages::tests::test_thinking_config_existing_variants_still_work ... ok
test result: ok. 23 passed; 0 failed

Reproduction with the previously-failing payload:

let payload = r#"{"thinking":{"type":"adaptive"}, ...}"#;
let req: CreateMessageRequest = serde_json::from_str(payload).unwrap();
// thinking = Some(Adaptive { display: None })
// re-serialized: {"type":"adaptive"}   ← no spurious "display":null
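
The round-trip property noted in the comments above (no `"display":null` when the field is unset) can be illustrated without serde. The sketch below is a hand-written serializer over stand-in types, assuming the lowercase wire names `summarized` and `omitted`. The PR itself relies on `#[serde_with::skip_serializing_none]` rather than code like this.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
pub enum ThinkingDisplay {
    Summarized,
    Omitted,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
pub enum ThinkingConfig {
    Adaptive { display: Option<ThinkingDisplay> },
    Disabled,
}

/// Serialize to JSON by hand, emitting `display` only when it is `Some`.
pub fn to_json(cfg: ThinkingConfig) -> String {
    match cfg {
        ThinkingConfig::Disabled => r#"{"type":"disabled"}"#.to_string(),
        // Unset display: the key is left out entirely instead of written as null.
        ThinkingConfig::Adaptive { display: None } => r#"{"type":"adaptive"}"#.to_string(),
        ThinkingConfig::Adaptive { display: Some(d) } => {
            let d = match d {
                ThinkingDisplay::Summarized => "summarized",
                ThinkingDisplay::Omitted => "omitted",
            };
            format!(r#"{{"type":"adaptive","display":"{d}"}}"#)
        }
    }
}
```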

Checklist

  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes (for openai-protocol and smg)
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Follow-ups (separate PRs)

  • Preserve unknown top-level fields on CreateMessageRequest (e.g. output_config, context_management) — currently silently dropped during serde round-trip, which means the upstream worker never sees them. Idiomatic fix: add #[serde(flatten)] pub extra: HashMap<String, Value>, matching the existing pattern on InputSchema.
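
The proposed follow-up keeps unrecognized keys instead of dropping them. As a rough, std-only model of what a `#[serde(flatten)]` catch-all map achieves, consider the sketch below. The `Request` struct, its single `model` field, and the string-valued map are hypothetical simplifications, not `CreateMessageRequest` itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a request with one known field and a catch-all.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Request {
    pub model: Option<String>,
    /// Unknown top-level keys land here instead of being discarded.
    pub extra: BTreeMap<String, String>,
}

/// Split raw key/value pairs into known fields and preserved extras.
pub fn from_pairs(pairs: &[(&str, &str)]) -> Request {
    let mut req = Request::default();
    for (key, value) in pairs {
        match *key {
            "model" => req.model = Some(value.to_string()),
            other => {
                req.extra.insert(other.to_string(), value.to_string());
            }
        }
    }
    req
}

/// Re-emit every field, known and unknown, so nothing is lost when forwarding.
pub fn to_pairs(req: &Request) -> Vec<(String, String)> {
    let mut out = Vec::new();
    if let Some(model) = &req.model {
        out.push(("model".to_string(), model.clone()));
    }
    out.extend(req.extra.iter().map(|(k, v)| (k.clone(), v.clone())));
    out
}
```

With this shape, a key such as `output_config` survives `from_pairs` followed by `to_pairs` even though the struct has no dedicated field for it.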

Summary by CodeRabbit

  • New Features

    • Added adaptive thinking mode with configurable display options (omitted or summarized).
  • Improvements

    • Adaptive mode is now handled consistently with enabled mode across reasoning and streaming paths, ensuring uniform behavior and overrides.
  • Tests

    • Added unit tests covering adaptive mode parsing, serialization round-trips, and continued compatibility with existing modes.

Review Change Stack

Add `ThinkingConfig::Adaptive { display: Option<ThinkingDisplay> }` so the
Anthropic Messages API accepts `{"type": "adaptive"}` (with optional
`{"display": "summarized" | "omitted"}`). Previously SMG rejected the
payload at the ValidatedJson extractor with `unknown variant 'adaptive'`,
forcing callers to maintain a wrapper proxy.

Adaptive mode is required on Claude Opus 4.7 and is the recommended mode on
Opus 4.6 / Sonnet 4.6. See
https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking.

In the gRPC router path, `Adaptive` is treated the same as `Enabled` for
template kwargs (`enable_thinking`/`thinking` = true) and for reasoning
parser gating; the model itself decides whether to actually emit thinking
content. The HTTP router proxies the request body byte-for-byte to upstream.

Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
@github-actions github-actions Bot added the grpc (gRPC client and router changes), protocols (Protocols crate changes), and model-gateway (Model gateway crate changes) labels May 27, 2026
@coderabbitai

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bb48c215-2cd0-4d26-9e4b-759c9f476195

📥 Commits

Reviewing files that changed from the base of the PR and between 9702e32 and a120e0b.

📒 Files selected for processing (1)
  • crates/protocols/src/messages.rs

Walkthrough

Adds a ThinkingConfig::Adaptive variant and optional display to the Messages protocol, and updates processor, streaming, and template codepaths to treat Adaptive like Enabled for reasoning parsing, thinking overrides, and template flags.

Changes

Adaptive Thinking Mode Support

| Layer / File(s) | Summary |
|---|---|
| Protocol definition, serde, and tests: crates/protocols/src/messages.rs | ThinkingConfig now has Adaptive { display: Option<ThinkingDisplay> }, Enabled includes display: Option<ThinkingDisplay>, and #[serde_with::skip_serializing_none] was added. Unit tests validate adaptive/enabled deserialization and round-trip serialization omitting display: None. |
| Reasoning parser activation: model_gateway/src/routers/grpc/regular/processor.rs, model_gateway/src/routers/grpc/regular/streaming.rs | separate_reasoning now returns true for ThinkingConfig::Adaptive in addition to Enabled, enabling the reasoning parser for adaptive mode in request and streaming flows. |
| Thinking override mapping: model_gateway/src/routers/grpc/regular/processor.rs, model_gateway/src/routers/grpc/regular/streaming.rs | user_thinking mapping now yields Some(true) for Adaptive as well as Enabled, aligning override behavior across both paths. |
| Template parameter mapping: model_gateway/src/routers/grpc/utils/message_utils.rs | process_messages() sets enable_thinking=true and thinking=true for both Enabled and Adaptive thinking configs so templates receive consistent flags. |
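
The template parameter mapping summarized above sets two flags together. A minimal sketch of that mapping follows, using a stand-in enum and a hypothetical `template_kwargs` helper. The actual logic lives inside `process_messages()`, and leaving the flags unset for `Disabled` or a missing config is an assumption made here.

```rust
#[allow(dead_code)] // the field is only matched with `..` in this sketch
pub enum ThinkingConfig {
    Enabled { budget_tokens: u32 }, // `budget_tokens` is an assumed field
    Disabled,
    Adaptive,
}

/// Hypothetical helper: chat-template flags derived from a thinking config.
pub fn template_kwargs(cfg: Option<&ThinkingConfig>) -> Vec<(&'static str, bool)> {
    match cfg {
        // Enabled and Adaptive share the "thinking on" arm.
        Some(ThinkingConfig::Enabled { .. } | ThinkingConfig::Adaptive) => {
            vec![("enable_thinking", true), ("thinking", true)]
        }
        // Assumption: no flags are set for Disabled or a missing config.
        _ => Vec::new(),
    }
}
```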

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • lightseekorg/smg#753: Modified Messages thinking handling in processor.rs and message_utils.rs, related to extending thinking handling to Adaptive.

Suggested labels

tests

Suggested reviewers

  • key4ng
  • slin1237

Poem

🐰 Adaptive thoughts now join the race,
A new display tucked in place,
Parsers wake and flags align,
Templates know when thinking's fine,
Tiny hop—big change, soft grace.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding support for adaptive thinking mode to the /v1/messages API endpoint, which is the primary objective of this PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@claude claude Bot left a comment


Clean, focused PR. All 5 ThinkingConfig match sites in the gRPC router are updated consistently (processor.rs ×2, streaming.rs ×2, message_utils.rs ×1). The #[serde_with::skip_serializing_none] on the enum is correct — it only affects Option fields, which only exist in the new Adaptive variant. HTTP router needs no changes since it proxies byte-for-byte. Tests cover deserialization (minimal + with display), round-trip serialization (confirms no spurious "display":null), and backward compatibility of existing variants.

Anthropic's API also accepts `display: "summarized" | "omitted"` on
`{"type": "enabled"}`, not just `"adaptive"`. Without this field on the
`Enabled` variant, `display` is silently dropped during serde round-trip
and the upstream worker never sees it — the same silent-loss bug we just
fixed for adaptive.

Existing pattern matches all use `Enabled { .. }`, so this is non-breaking
for them. Only one test-internal `matches!` pattern needed `display: None`
added to compile.

Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces support for adaptive thinking configuration. It adds an Adaptive variant to the ThinkingConfig enum and a new ThinkingDisplay enum to control how thinking content is returned in API responses. It also updates the gRPC response processors, streaming processors, and message utilities to treat the Adaptive configuration as having thinking enabled, and includes corresponding unit tests. There are no review comments to address, and I have no additional feedback to provide.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a120e0b1d2


Comment on lines 602 to +606
  let user_thinking = match &messages_request.thinking {
-     Some(messages::ThinkingConfig::Enabled { .. }) => Some(true),
+     Some(
+         messages::ThinkingConfig::Enabled { .. }
+         | messages::ThinkingConfig::Adaptive { .. },
+     ) => Some(true),

P1: Avoid forcing adaptive requests into reasoning mode

Treating adaptive as user_thinking = Some(true) in this path can misclassify normal completions as reasoning when the model chooses not to emit think delimiters. In that case mark_reasoning_started() is applied and detect_and_parse_reasoning can move the whole answer into a Thinking block, leaving no normal text/tool content, which breaks non-stream /v1/messages responses for models with a reasoning parser.
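
The failure mode described here can be reproduced with a toy splitter. This is not SMG's `detect_and_parse_reasoning`. The `</think>` delimiter, the function name, and the `start_in_reasoning` flag are assumptions chosen only to mirror the comment.

```rust
/// Toy model: split output into (thinking, answer) around `</think>`.
/// `start_in_reasoning` plays the role of the forced `Some(true)` override.
pub fn split_reasoning(output: &str, start_in_reasoning: bool) -> (String, String) {
    const END: &str = "</think>";
    if !start_in_reasoning {
        // No override: the whole output is treated as the normal answer.
        return (String::new(), output.to_string());
    }
    match output.split_once(END) {
        // Delimiter present: text before it is thinking, the rest is the answer.
        Some((thinking, answer)) => (thinking.to_string(), answer.to_string()),
        // Delimiter absent: everything is classified as thinking.
        None => (output.to_string(), String::new()),
    }
}
```

In this model, a plain answer with no delimiter ends up entirely in the thinking half when the flag is set, which is the shape of the breakage the comment warns about.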


Comment on lines 1622 to +1626
  let user_thinking = match &original_request.thinking {
-     Some(messages::ThinkingConfig::Enabled { .. }) => Some(true),
+     Some(
+         messages::ThinkingConfig::Enabled { .. }
+         | messages::ThinkingConfig::Adaptive { .. },
+     ) => Some(true),

P1: Prevent adaptive mode from hijacking streamed text as thinking

This maps adaptive to Some(true), which enables thinking_override and causes the streaming reasoning parser to start in reasoning mode even when the model emits plain text. For adaptive requests where no closing reasoning token appears, chunks are emitted as thinking_delta instead of normal text/tool deltas, corrupting streamed Messages output.
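
A toy chunk classifier makes the streaming concern concrete. It is an illustrative model only, not the gateway's streaming parser. The `Delta` enum, the `</think>` token, and the simplification that the closing token always arrives as its own chunk are all assumptions.

```rust
/// Hypothetical delta kinds, loosely modeled on thinking vs. text deltas.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Delta {
    Thinking(String),
    Text(String),
}

/// Toy streaming classifier. Assumes `</think>` arrives as its own chunk.
pub fn classify_stream(chunks: &[&str], start_in_reasoning: bool) -> Vec<Delta> {
    let mut in_reasoning = start_in_reasoning;
    let mut out = Vec::new();
    for chunk in chunks {
        if *chunk == "</think>" {
            // The closing token switches the stream back to normal text.
            in_reasoning = false;
        } else if in_reasoning {
            out.push(Delta::Thinking(chunk.to_string()));
        } else {
            out.push(Delta::Text(chunk.to_string()));
        }
    }
    out
}
```

When the stream starts in reasoning mode and no closing token ever arrives, every chunk is emitted as `Delta::Thinking`, matching the behavior the comment describes for adaptive requests.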


@slin1237 slin1237 merged commit ef782fe into main May 27, 2026
47 of 54 checks passed
@slin1237 slin1237 deleted the claude/ecstatic-liskov-a2d0e1 branch May 27, 2026 01:49

Labels

grpc (gRPC client and router changes), model-gateway (Model gateway crate changes), protocols (Protocols crate changes)


2 participants