feat(bedrock): add top_k and extended thinking support for Claude models#1732
feat(bedrock): add top_k and extended thinking support for Claude models#1732mesutoezdil wants to merge 3 commits intokagent-dev:mainfrom
Conversation
Expose top_k and thinking_budget_tokens for Bedrock Claude models so operators can enable extended thinking mode via the ModelConfig CRD. Both fields are optional and only sent to the Converse API when set, using additionalModelRequestFields which is the correct mechanism for parameters outside the standard InferenceConfiguration block. Closes kagent-dev#1721 Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR adds support for Claude-specific Bedrock Converse API parameters (top_k and extended thinking via thinking.budget_tokens) and wires them through the ModelConfig CRD → translator → Go ADK → (also) Python ADK Bedrock client so operators can enable these options end-to-end.
Changes:
- Extends the Bedrock section of the ModelConfig CRD with optional
topKandthinkingBudgetTokensfields (and regenerates/syncs CRD manifests/templates). - Updates Go Bedrock model implementation to send these Claude-only settings via
AdditionalModelRequestFields(only when set). - Updates Python
KAgentBedrockLlmand the Python ADK types factory to pass and emit the sameadditionalModelRequestFieldspayload.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| python/packages/kagent-adk/src/kagent/adk/types.py | Adds top_k / thinking_budget_tokens to the Bedrock model config type and passes them into KAgentBedrockLlm. |
| python/packages/kagent-adk/src/kagent/adk/models/_bedrock.py | Builds additionalModelRequestFields for top_k and thinking when configured. |
| helm/kagent-crds/templates/kagent.dev_modelconfigs.yaml | CRD Helm template updated to include Bedrock topK and thinkingBudgetTokens. |
| go/core/internal/controller/translator/agent/adk_api_translator.go | Translator now passes Bedrock TopK and ThinkingBudgetTokens into the ADK Bedrock model config. |
| go/api/v1alpha2/zz_generated.deepcopy.go | Deepcopy updated to properly deep-copy Bedrock pointer fields and use BedrockConfig’s DeepCopyInto. |
| go/api/v1alpha2/modelconfig_types.go | Adds TopK and ThinkingBudgetTokens fields to the BedrockConfig API type. |
| go/api/config/crd/bases/kagent.dev_modelconfigs.yaml | Generated CRD base updated with the new Bedrock properties. |
| go/api/adk/types.go | Go ADK Bedrock model type gains top_k / thinking_budget_tokens JSON fields. |
| go/adk/pkg/models/bedrock.go | Adds ThinkingBudgetTokens to config and sends Claude-specific fields via AdditionalModelRequestFields in Converse/ConverseStream. |
| go/adk/pkg/agent/agent.go | Wires ADK Bedrock model config fields into the Bedrock runtime model config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if self.top_k is not None: | ||
| additional_fields["top_k"] = self.top_k | ||
| if self.thinking_budget_tokens is not None: |
There was a problem hiding this comment.
thinking_budget_tokens is documented as requiring a minimum (>=1024), but this code will pass through any integer to Bedrock and fail later with a generic API error. Consider validating top_k (>0) and thinking_budget_tokens (>=1024) before populating additionalModelRequestFields, and extend the existing unit tests to assert the kwargs passed to client.converse/converse_stream when these options are set.
| if self.top_k is not None: | |
| additional_fields["top_k"] = self.top_k | |
| if self.thinking_budget_tokens is not None: | |
| if self.top_k is not None: | |
| if self.top_k <= 0: | |
| raise ValueError("top_k must be greater than 0") | |
| additional_fields["top_k"] = self.top_k | |
| if self.thinking_budget_tokens is not None: | |
| if self.thinking_budget_tokens < 1024: | |
| raise ValueError("thinking_budget_tokens must be greater than or equal to 1024") |
| // ThinkingBudgetTokens sets the maximum number of tokens Claude may use for | ||
| // internal reasoning before producing its response. When set, extended thinking | ||
| // is enabled. Only supported by Claude models that support extended thinking. | ||
| // Must be at least 1024 when set. | ||
| // +optional |
There was a problem hiding this comment.
The doc comment states a hard constraint (“Must be at least 1024 when set”), but the type doesn’t enforce it via kubebuilder validation markers. Add CRD validation tags (e.g., +kubebuilder:validation:Minimum=1024 for ThinkingBudgetTokens, and a sensible minimum for TopK) so bad configs are rejected before reaching Bedrock.
| ThinkingBudgetTokens sets the maximum number of tokens Claude may use for | ||
| internal reasoning before producing its response. When set, extended thinking | ||
| is enabled. Only supported by Claude models that support extended thinking. | ||
| Must be at least 1024 when set. |
There was a problem hiding this comment.
The description says thinkingBudgetTokens must be at least 1024, but the CRD schema doesn’t enforce that constraint. Add an OpenAPI validation like minimum: 1024 (generated via kubebuilder tags or edited here) so invalid ModelConfigs are rejected at admission time.
| Must be at least 1024 when set. | |
| Must be at least 1024 when set. | |
| minimum: 1024 |
| ThinkingBudgetTokens sets the maximum number of tokens Claude may use for | ||
| internal reasoning before producing its response. When set, extended thinking | ||
| is enabled. Only supported by Claude models that support extended thinking. | ||
| Must be at least 1024 when set. |
There was a problem hiding this comment.
The description says thinkingBudgetTokens must be at least 1024, but the Helm CRD template doesn’t enforce that constraint in the schema. Add an OpenAPI validation like minimum: 1024 (ideally via kubebuilder tags and regeneration) so bad values fail fast.
| Must be at least 1024 when set. | |
| Must be at least 1024 when set. | |
| minimum: 1024 |
| // buildAdditionalModelRequestFields returns a document.Interface containing | ||
| // model-specific parameters that are not part of InferenceConfiguration. | ||
| // For Claude on Bedrock this includes top_k and thinking configuration. | ||
| // Returns nil when no extra fields are needed. | ||
| func (m *BedrockModel) buildAdditionalModelRequestFields() document.Interface { |
There was a problem hiding this comment.
New behavior is introduced here (building AdditionalModelRequestFields for top_k / thinking), but there are no unit tests asserting when this is nil vs populated and what JSON is sent to Bedrock. Please add coverage (e.g., in go/adk/pkg/models/bedrock_test.go) for: no fields set → nil, only TopK set → {top_k: ...}, only ThinkingBudgetTokens set → {thinking: {...}}, and both set.
supreme-gg-gg
left a comment
There was a problem hiding this comment.
The intention of the change makes sense to me, but I want to discuss a potentially different way to structure bedrock config CRD.
In Bedrock there are two types of model parameters, one typed struct called inferenceConfig (which we use already) and an untyped JSON called additionalModelRequestFields which you are using here.
I think to stay close to Bedrock's API design intention we should make inferenceConfig values directly configurable on BedrockConfig, and other custom provider-specific headers a raw JSON. The immediate benefit I see is that this avoids adding new fields for all the (potentially conflicting) provider params and eventually having a massive config, since everyone has their different needs when using Bedrock.
I'm suggesting something similar to how OllamaConfig uses map[string]string options to allow for arbitrary model parameters, or you can perhaps use *apiextensionsv1.JSON here.
|
cool point on the long-term maintainability concern. Typed fields would keep growing as new Claude-specific params appear. 1 thing worth noting: map[string]string would not work here bcs the thinking config is a nested object ({"type": "enabled", "budget_tokens": 16000}), not a flat string value. Would *apiextensionsv1.JSON be the right pick for additionalModelRequestFields, or do you have a different shape in mind? Happy to rework the PR along those lines once we agree on the type. |
|
|
|
Going with *apiextensionsv1.JSON then. Will add +kubebuilder:pruning:PreserveUnknownFields on the field and rework the PR to replace the two typed fields with a single additionalModelRequestFields raw JSON field. Will push shortly. |
…elds
Replace the typed topK and thinkingBudgetTokens fields in BedrockConfig
with a single additionalModelRequestFields raw JSON field. This avoids
growing the CRD with every new Claude-specific parameter and keeps the
config close to how Bedrock models the API itself.
CRD type uses *apiextensionsv1.JSON with +kubebuilder:pruning:PreserveUnknownFields
so unknown nested keys are not pruned at admission time. The translator
unmarshals the raw bytes into map[string]any before passing them to the
Go ADK BedrockConfig. The Python ADK receives the same field as a plain
dict and forwards it directly to the Converse API.
Example ModelConfig usage:
bedrock:
region: us-east-1
additionalModelRequestFields: |
{top_k: 5, thinking: {type: enabled, budget_tokens: 16000}}
CRD manifests and Helm template regenerated via make manifests.
Signed-off-by: mesutoezdil <mesudozdil@gmail.com>
|
Thx @supreme-gg-gg. Done. topK and thinkingBudgetTokens are gone. BedrockConfig now has a single additionalModelRequestFields *apiextensionsv1.JSON field with +kubebuilder:pruning:PreserveUnknownFields. The translator unmarshals the raw bytes into map[string]any and passes it straight to the Converse API. Python side gets the same field as a plain dict. CRD manifests and Helm template regenerated. |
supreme-gg-gg
left a comment
There was a problem hiding this comment.
The changes look great, before we can merge this, have you been able to test this with cache point config / thinking config: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-get-started
It seems like the original issue discusses using prompt caching and it would be great if we can verify this does improve the cost usage with these thinking / caching config
|
I do not have a Bedrock account to verify the cost reduction end-to-end against the live API. What this PR is verified to do: the unit tests assert that whatever JSON is set on additionalModelRequestFields lands in the kwargs passed to boto3.converse / converse_stream verbatim. So a config with thinking enabled (e.g. {"thinking": {"type": "enabled", "budget_tokens": 16000}}) reaches Bedrock unchanged. That is the contract this PR adds. Two things worth separating:
Would it be reasonable to merge this PR for the thinking/parameter use case and track caching separately, or do you want both in one PR? If the latter, would you have a way to test it on real Bedrock since I cannot? |
supreme-gg-gg
left a comment
There was a problem hiding this comment.
I am able to test this with bedrock claude, seems to be working. Thanks @mesutoezdil
Closes #1721
Bedrock's Converse API keeps Claude-specific parameters separate from the standard InferenceConfiguration block. Fields like top_k and the extended thinking config have to go through additionalModelRequestFields. This PR wires those two fields end-to-end so operators can turn on Claude extended thinking via the ModelConfig CRD.
Changes:
Both fields are fully optional so existing configs are not affected.
Signed-off-by: mesutoezdil mesudozdil@gmail.com