feat(bedrock): add top_k and extended thinking support for Claude models by mesutoezdil · Pull Request #1732 · kagent-dev/kagent

mesutoezdil · 2026-04-23T11:09:27Z

Bedrock's Converse API keeps Claude-specific parameters separate from the standard InferenceConfiguration block. Fields like top_k and the extended thinking config have to go through additionalModelRequestFields. This PR wires those two fields end-to-end so operators can turn on Claude extended thinking via the ModelConfig CRD.

Changes:

BedrockConfig CRD type gets optional topK and thinkingBudgetTokens fields. Deepcopy and CRD manifests regenerated with make manifests, Helm CRD template synced.
Go ADK Bedrock struct and BedrockModel updated. buildAdditionalModelRequestFields() builds the payload only when at least one field is set.
Python KAgentBedrockLlm gets matching top_k and thinking_budget_tokens fields and builds additionalModelRequestFields the same way.
The translator passes both fields from ModelConfig through to the ADK config.

Both fields are fully optional so existing configs are not affected.

Signed-off-by: mesutoezdil mesudozdil@gmail.com

Expose top_k and thinking_budget_tokens for Bedrock Claude models so operators can enable extended thinking mode via the ModelConfig CRD. Both fields are optional and only sent to the Converse API when set, using additionalModelRequestFields which is the correct mechanism for parameters outside the standard InferenceConfiguration block. Closes kagent-dev#1721 Signed-off-by: mesutoezdil <mesudozdil@gmail.com>

Copilot

Pull request overview

This PR adds support for Claude-specific Bedrock Converse API parameters (top_k and extended thinking via thinking.budget_tokens) and wires them through the ModelConfig CRD → translator → Go ADK → (also) Python ADK Bedrock client so operators can enable these options end-to-end.

Changes:

Extends the Bedrock section of the ModelConfig CRD with optional topK and thinkingBudgetTokens fields (and regenerates/syncs CRD manifests/templates).
Updates Go Bedrock model implementation to send these Claude-only settings via AdditionalModelRequestFields (only when set).
Updates Python KAgentBedrockLlm and the Python ADK types factory to pass and emit the same additionalModelRequestFields payload.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
python/packages/kagent-adk/src/kagent/adk/types.py	Adds `top_k` / `thinking_budget_tokens` to the Bedrock model config type and passes them into `KAgentBedrockLlm`.
python/packages/kagent-adk/src/kagent/adk/models/_bedrock.py	Builds `additionalModelRequestFields` for `top_k` and `thinking` when configured.
helm/kagent-crds/templates/kagent.dev_modelconfigs.yaml	CRD Helm template updated to include Bedrock `topK` and `thinkingBudgetTokens`.
go/core/internal/controller/translator/agent/adk_api_translator.go	Translator now passes Bedrock `TopK` and `ThinkingBudgetTokens` into the ADK Bedrock model config.
go/api/v1alpha2/zz_generated.deepcopy.go	Deepcopy updated to properly deep-copy Bedrock pointer fields and use BedrockConfig’s DeepCopyInto.
go/api/v1alpha2/modelconfig_types.go	Adds `TopK` and `ThinkingBudgetTokens` fields to the BedrockConfig API type.
go/api/config/crd/bases/kagent.dev_modelconfigs.yaml	Generated CRD base updated with the new Bedrock properties.
go/api/adk/types.go	Go ADK Bedrock model type gains `top_k` / `thinking_budget_tokens` JSON fields.
go/adk/pkg/models/bedrock.go	Adds `ThinkingBudgetTokens` to config and sends Claude-specific fields via `AdditionalModelRequestFields` in Converse/ConverseStream.
go/adk/pkg/agent/agent.go	Wires ADK Bedrock model config fields into the Bedrock runtime model config.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-23T11:15:13Z

+        if self.top_k is not None:
+            additional_fields["top_k"] = self.top_k
+        if self.thinking_budget_tokens is not None:


thinking_budget_tokens is documented as requiring a minimum (>=1024), but this code will pass through any integer to Bedrock and fail later with a generic API error. Consider validating top_k (>0) and thinking_budget_tokens (>=1024) before populating additionalModelRequestFields, and extend the existing unit tests to assert the kwargs passed to client.converse/converse_stream when these options are set.

Suggested change

if self.top_k is not None:

additional_fields["top_k"] = self.top_k

if self.thinking_budget_tokens is not None:

if self.top_k is not None:

if self.top_k <= 0:

raise ValueError("top_k must be greater than 0")

additional_fields["top_k"] = self.top_k

if self.thinking_budget_tokens is not None:

if self.thinking_budget_tokens < 1024:

raise ValueError("thinking_budget_tokens must be greater than or equal to 1024")

Copilot · 2026-04-23T11:15:14Z

+	// ThinkingBudgetTokens sets the maximum number of tokens Claude may use for
+	// internal reasoning before producing its response. When set, extended thinking
+	// is enabled. Only supported by Claude models that support extended thinking.
+	// Must be at least 1024 when set.
+	// +optional


The doc comment states a hard constraint (“Must be at least 1024 when set”), but the type doesn’t enforce it via kubebuilder validation markers. Add CRD validation tags (e.g., +kubebuilder:validation:Minimum=1024 for ThinkingBudgetTokens, and a sensible minimum for TopK) so bad configs are rejected before reaching Bedrock.

Copilot · 2026-04-23T11:15:14Z

+                      ThinkingBudgetTokens sets the maximum number of tokens Claude may use for
+                      internal reasoning before producing its response. When set, extended thinking
+                      is enabled. Only supported by Claude models that support extended thinking.
+                      Must be at least 1024 when set.


The description says thinkingBudgetTokens must be at least 1024, but the CRD schema doesn’t enforce that constraint. Add an OpenAPI validation like minimum: 1024 (generated via kubebuilder tags or edited here) so invalid ModelConfigs are rejected at admission time.

Suggested change

Must be at least 1024 when set.

Must be at least 1024 when set.

minimum: 1024

Copilot · 2026-04-23T11:15:15Z

+                      ThinkingBudgetTokens sets the maximum number of tokens Claude may use for
+                      internal reasoning before producing its response. When set, extended thinking
+                      is enabled. Only supported by Claude models that support extended thinking.
+                      Must be at least 1024 when set.


The description says thinkingBudgetTokens must be at least 1024, but the Helm CRD template doesn’t enforce that constraint in the schema. Add an OpenAPI validation like minimum: 1024 (ideally via kubebuilder tags and regeneration) so bad values fail fast.

Suggested change

Must be at least 1024 when set.

Must be at least 1024 when set.

minimum: 1024

Copilot · 2026-04-23T11:15:15Z

+// buildAdditionalModelRequestFields returns a document.Interface containing
+// model-specific parameters that are not part of InferenceConfiguration.
+// For Claude on Bedrock this includes top_k and thinking configuration.
+// Returns nil when no extra fields are needed.
+func (m *BedrockModel) buildAdditionalModelRequestFields() document.Interface {


New behavior is introduced here (building AdditionalModelRequestFields for top_k / thinking), but there are no unit tests asserting when this is nil vs populated and what JSON is sent to Bedrock. Please add coverage (e.g., in go/adk/pkg/models/bedrock_test.go) for: no fields set → nil, only TopK set → {top_k: ...}, only ThinkingBudgetTokens set → {thinking: {...}}, and both set.

supreme-gg-gg

The intention of the change makes sense to me, but I want to discuss a potentially different way to structure bedrock config CRD.

In Bedrock there are two types of model parameters, one typed struct called inferenceConfig (which we use already) and an untyped JSON called additionalModelRequestFields which you are using here.

I think to stay close to Bedrock's API design intention we should make inferenceConfig values directly configurable on BedrockConfig, and other custom provider-specific headers a raw JSON. The immediate benefit I see is that this avoids adding new fields for all the (potentially conflicting) provider params and eventually having a massive config, since everyone has their different needs when using Bedrock.

I'm suggesting something similar to how OllamaConfig uses map[string]string options to allow for arbitrary model parameters, or you can perhaps use *apiextensionsv1.JSON here.

mesutoezdil · 2026-04-24T08:22:21Z

cool point on the long-term maintainability concern. Typed fields would keep growing as new Claude-specific params appear. 1 thing worth noting: map[string]string would not work here bcs the thinking config is a nested object ({"type": "enabled", "budget_tokens": 16000}), not a flat string value. Would *apiextensionsv1.JSON be the right pick for additionalModelRequestFields, or do you have a different shape in mind? Happy to rework the PR along those lines once we agree on the type.

supreme-gg-gg · 2026-04-24T16:27:16Z

*apiextensionsv1.JSON should work for nested objects since it just stores raw json bytes that can be unmarshalled. An alternative I think is runtime.RawExtension, you can pick the right one to use. You might want to add this validation rule: // +kubebuilder:pruning:PreserveUnknownFields to avoid these untyped fields from being pruned.

mesutoezdil · 2026-04-24T16:56:27Z

Going with *apiextensionsv1.JSON then. Will add +kubebuilder:pruning:PreserveUnknownFields on the field and rework the PR to replace the two typed fields with a single additionalModelRequestFields raw JSON field. Will push shortly.

…elds Replace the typed topK and thinkingBudgetTokens fields in BedrockConfig with a single additionalModelRequestFields raw JSON field. This avoids growing the CRD with every new Claude-specific parameter and keeps the config close to how Bedrock models the API itself. CRD type uses *apiextensionsv1.JSON with +kubebuilder:pruning:PreserveUnknownFields so unknown nested keys are not pruned at admission time. The translator unmarshals the raw bytes into map[string]any before passing them to the Go ADK BedrockConfig. The Python ADK receives the same field as a plain dict and forwards it directly to the Converse API. Example ModelConfig usage: bedrock: region: us-east-1 additionalModelRequestFields: | {top_k: 5, thinking: {type: enabled, budget_tokens: 16000}} CRD manifests and Helm template regenerated via make manifests. Signed-off-by: mesutoezdil <mesudozdil@gmail.com>

mesutoezdil · 2026-04-24T17:16:26Z

Thx @supreme-gg-gg. Done. topK and thinkingBudgetTokens are gone. BedrockConfig now has a single additionalModelRequestFields *apiextensionsv1.JSON field with +kubebuilder:pruning:PreserveUnknownFields. The translator unmarshals the raw bytes into map[string]any and passes it straight to the Converse API. Python side gets the same field as a plain dict. CRD manifests and Helm template regenerated.

supreme-gg-gg

The changes look great, before we can merge this, have you been able to test this with cache point config / thinking config: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-get-started

It seems like the original issue discusses using prompt caching and it would be great if we can verify this does improve the cost usage with these thinking / caching config

mesutoezdil · 2026-04-25T20:35:34Z

I do not have a Bedrock account to verify the cost reduction end-to-end against the live API.

What this PR is verified to do: the unit tests assert that whatever JSON is set on additionalModelRequestFields lands in the kwargs passed to boto3.converse / converse_stream verbatim. So a config with thinking enabled (e.g. {"thinking": {"type": "enabled", "budget_tokens": 16000}}) reaches Bedrock unchanged. That is the contract this PR adds.

Two things worth separating:

Thinking config: this PR covers it. The raw JSON pass-through is exactly how Bedrock expects extended thinking to be configured per the docs.
Prompt caching (cachePoint): this PR does NOT cover it. cachePoint blocks live inside the message array, not in additionalModelRequestFields, so they need a separate change in _convert_content_to_converse_messages. The original issue mentions cost concerns broadly but the proposed config (effort, thinking) is parameter-level rather than cache-point-level. Happy to follow up with a dedicated caching PR once this lands.

Would it be reasonable to merge this PR for the thinking/parameter use case and track caching separately, or do you want both in one PR? If the latter, would you have a way to test it on real Bedrock since I cannot?

supreme-gg-gg

I am able to test this with bedrock claude, seems to be working. Thanks @mesutoezdil

Copilot AI review requested due to automatic review settings April 23, 2026 11:09

mesutoezdil requested review from EItanya, ilackarms, peterj and yuval-k as code owners April 23, 2026 11:09

github-actions Bot added the enhancement New feature or request label Apr 23, 2026

Copilot started reviewing on behalf of mesutoezdil April 23, 2026 11:09 View session

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Apr 23, 2026

Copilot AI reviewed Apr 23, 2026

View reviewed changes

supreme-gg-gg reviewed Apr 23, 2026

View reviewed changes

supreme-gg-gg self-assigned this Apr 23, 2026

mesutoezdil requested review from iplay88keys and jmhbh as code owners April 24, 2026 17:16

This was referenced Apr 24, 2026

Apply to become a kagent organization member - mesutoezdil kagent-dev/community#14

Open

Apply to become a kagent organization member - mesutoezdil (moved to kagent-dev/community#14) mesutoezdil/community#1

Closed

supreme-gg-gg reviewed Apr 25, 2026

View reviewed changes

Merge branch 'main' into feat/bedrock-claude-thinking

28a679d

supreme-gg-gg approved these changes Apr 27, 2026

View reviewed changes

	Must be at least 1024 when set.
	Must be at least 1024 when set.
	minimum: 1024

Conversation

mesutoezdil commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

supreme-gg-gg left a comment

Choose a reason for hiding this comment

Uh oh!

mesutoezdil commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

supreme-gg-gg commented Apr 24, 2026

Uh oh!

mesutoezdil commented Apr 24, 2026

Uh oh!

mesutoezdil commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

supreme-gg-gg left a comment

Choose a reason for hiding this comment

Uh oh!

mesutoezdil commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

supreme-gg-gg left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mesutoezdil commented Apr 23, 2026 •

edited

Loading

mesutoezdil commented Apr 24, 2026 •

edited

Loading

mesutoezdil commented Apr 24, 2026 •

edited

Loading

mesutoezdil commented Apr 25, 2026 •

edited

Loading