feat(ai): per-provider max_tokens defaults with optional override by kirangadhave · Pull Request #9703 · marimo-team/marimo

kirangadhave · 2026-05-27T22:16:05Z

Summary

Drop the blanket DEFAULT_MAX_TOKENS = 4096 so reasoning-heavy models (Kimi K2.6, DeepSeek, gpt-oss, etc.) stop truncating or hard-erroring at 4096.

Non-Anthropic providers omit max_tokens from settings entirely when unset, letting pydantic-ai fall through to each provider's own default.
Anthropic provider call keeps a floor of 32768 because its API requires the field.
Adds an optional ai.max_tokens config field plus a Settings > AI checkbox that, when on, applies one explicit limit to every provider (including Anthropic).

Kimi K2.6 working with ~27k chars (roughly 6-7k tokens)

After UI override to 100 max_tokens, reasoning errors out at 550 chars

Behavior

Override	Anthropic	Other providers
Off (default)	server sends 32768	`max_tokens` omitted; provider default applies
On, value `N`	`N`	`N`

Closes MO-6314

vercel · 2026-05-27T22:16:10Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	May 29, 2026 5:17pm

cubic-dev-ai

1 issue found across 12 files

Architecture diagram

sequenceDiagram
    participant UI as Settings UI (ai-config.tsx)
    participant Store as Jotai Store (resolvedMarimoConfigAtom)
    participant CompletionUtils as completion-utils.ts
    participant API as AI Endpoint (ai.py)
    participant Config as ai/config.py
    participant Provider as Provider (providers.py)
    participant Model as pydantic-ai Model

    Note over UI,Model: Per-provider max_tokens defaults with optional override

    UI->>UI: User toggles checkbox or changes input
    UI->>Store: NEW: Update config.ai.max_tokens

    CompletionUtils->>Store: NEW: Read config.ai.max_tokens
    alt Override is set (number)
        CompletionUtils-->>API: NEW: Include maxTokens in request body
    else Override is unset
        CompletionUtils-->>API: NEW: Omit maxTokens from request body
    end

    API->>Config: Calls get_max_tokens(config)
    Config->>Config: Returns None when unconfigured, or configured value

    alt Body includes explicit max_tokens
        API->>API: Use body.max_tokens
    else Body omits max_tokens
        API->>API: Use get_max_tokens(config) result (may be None)
    end

    API->>Provider: stream_completion/stream_text(max_tokens)

    alt Anthropic provider
        Provider->>Provider: create_model(max_tokens)
        alt max_tokens is None
            Provider->>Provider: Apply ANTHROPIC_DEFAULT_MAX_TOKENS (32768)
        else explicit value
            Provider->>Provider: Pass through as-is
        end
        Provider->>Model: AnthropicModel(settings={max_tokens: resolved})
    else OpenAI/Google/Bedrock provider
        Provider->>Provider: create_model(max_tokens)
        alt max_tokens is None
            Provider-->>Model: Omit settings entirely (provider default applies)
        else explicit value
            Provider->>Model: Pass {max_tokens: N} in settings
        end
        Provider->>Model: (settings={max_tokens: N} or empty)
    end

    Model-->>Provider: Returns model instance
    Provider-->>API: Returns streaming response

_{Reply with feedback, questions, or to request a fix.

Re-trigger cubic}

Copilot

Pull request overview

This PR removes the global DEFAULT_MAX_TOKENS=4096 behavior so providers can use their own defaults when max_tokens is unset, while keeping Anthropic’s required max_tokens field via a 32,768 default floor and adding a user-configurable override.

Changes:

Backend: get_max_tokens() now returns None when unset; providers omit max_tokens in model settings unless explicitly configured (Anthropic applies a 32,768 default).
API/OpenAPI: adds optional maxTokens to completion/chat request schemas and server request models.
Config/UI: adds ai.max_tokens to user config schema and a Settings UI control that can set or clear the override (with “None-as-delete” support in config persistence).

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/_server/ai/test_providers.py	Adds tests for Anthropic floor behavior and omission/pass-through behavior when `max_tokens` is unset/set.
tests/_server/ai/test_ai_config.py	Updates `get_max_tokens` tests to expect `None` when unset.
tests/_config/test_manager.py	Adds test coverage for “None-as-delete” config save semantics.
packages/openapi/src/api.ts	Regenerated TS OpenAPI types to include optional `maxTokens`.
packages/openapi/api.yaml	Adds `maxTokens` to completion/chat request schemas.
marimo/_server/models/models.py	Documents “None-as-delete” semantics for user config saves.
marimo/_server/models/completion.py	Adds `max_tokens` to completion and chat request structs.
marimo/_server/api/endpoints/ai.py	Prefers request `max_tokens` when provided, otherwise falls back to config.
marimo/_server/ai/providers.py	Updates provider model/agent creation to accept optional `max_tokens` and omit settings when unset (Anthropic uses default floor).
marimo/_server/ai/constants.py	Replaces `DEFAULT_MAX_TOKENS` with `ANTHROPIC_DEFAULT_MAX_TOKENS`.
marimo/_server/ai/config.py	`get_max_tokens()` now returns `int
marimo/_config/manager.py	Implements recursive dropping of `None` values before writing config to disk.
marimo/_config/config.py	Minor signature formatting change.
frontend/src/core/config/config-schema.ts	Adds `ai.max_tokens` as nullable/optional in user config schema.
frontend/src/components/editor/ai/completion-utils.ts	Includes `maxTokens` in completion request bodies when configured.
frontend/src/components/app-config/user-config-form.tsx	Adjusts form debounce and onChange submit wiring.
frontend/src/components/app-config/ai-config.tsx	Adds UI controls (checkbox + numeric input) to set/clear `ai.max_tokens`.

kirangadhave · 2026-05-28T21:41:05Z

@cubic-dev-ai

cubic-dev-ai · 2026-05-28T21:41:12Z

@cubic-dev-ai

@kirangadhave I have started the AI code review. It will take a few minutes to complete.

Copilot

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

cubic-dev-ai

4 issues found across 18 files

Architecture diagram

sequenceDiagram
    participant UI as User Config UI
    participant RHF as React Hook Form
    participant ConfigAPI as Config Save API
    participant ConfigMgr as Config Manager
    participant TOML as TOML Disk Store
    participant CompletionUI as AI Completion UI
    participant ResolvedConfig as Resolved Config Atom
    participant CompletionReq as AI Completion/Chat Request
    participant AIEndpoint as AI API Endpoint (/completion | /chat)
    participant Provider as Provider (Anthropic/OpenAI/etc.)
    participant PydanticAI as Pydantic-AI Model

    Note over UI,TOML: Configuration Save Flow

    UI->>RHF: checkbox toggle (on/off)
    RHF->>RHF: setValue("ai.max_tokens", 32768 or null)
    RHF->>ConfigAPI: saveConfig({ ai: { max_tokens: N | null } })
    ConfigAPI->>ConfigMgr: save_config(partial config)
    ConfigMgr->>ConfigMgr: merge with current, call _drop_none_values()
    alt max_tokens is null
        ConfigMgr->>TOML: write config WITHOUT "max_tokens" key
    else max_tokens is a number
        ConfigMgr->>TOML: write config WITH "max_tokens": N
    end
    TOML-->>ConfigMgr: success
    ConfigMgr-->>ConfigAPI: ok
    ConfigAPI-->>UI: config saved

    Note over CompletionUI,PydanticAI: AI Completion/Chat Request Flow

    CompletionUI->>ResolvedConfig: read resolvedMarimoConfigAtom
    ResolvedConfig-->>CompletionUI: { ai: { max_tokens: N | undefined } }
    CompletionUI->>CompletionReq: POST /api/ai/completion | /chat with maxTokens (if set)
    CompletionReq->>AIEndpoint: request with body.maxTokens or undefined
    AIEndpoint->>AIEndpoint: resolve: body.maxTokens ?? config.max_tokens ?? None
    alt Non-Anthropic provider
        AIEndpoint->>Provider: stream_completion/stream_text(max_tokens=None)
        Provider->>PydanticAI: create_model(max_tokens=None)
        PydanticAI-->>Provider: model with settings omitting max_tokens
        Provider->>PydanticAI: stream using provider default
    else Anthropic provider
        AIEndpoint->>Provider: stream_completion/stream_text(max_tokens=N | 32768)
        Provider->>Provider: if max_tokens is None, use ANTHROPIC_DEFAULT_MAX_TOKENS
        Provider->>PydanticAI: create_model(max_tokens=32768 or N)
        PydanticAI-->>Provider: model with explicit max_tokens
        Provider->>PydanticAI: stream with max_tokens applied
    end
    PydanticAI-->>Provider: stream response
    Provider-->>AIEndpoint: streamed text
    AIEndpoint-->>CompletionReq: streaming response
    CompletionReq-->>CompletionUI: display result

    Note over CompletionReq,AIEndpoint: Error Path - Anthropic Without Override
    alt No override, no config, Anthropic provider
        CompletionReq->>AIEndpoint: body.maxTokens undefined, config max_tokens undefined
        AIEndpoint->>Provider: max_tokens=None
        Provider->>Provider: set to ANTHROPIC_DEFAULT_MAX_TOKENS (32768)
        Provider->>PydanticAI: create_model(max_tokens=32768)
        PydanticAI-->>Provider: model with max_tokens=32768
        Note over Provider: succeeds (required field satisfied)
    end

_{Reply with feedback, questions, or to request a fix.

Re-trigger cubic}

dmadisetti

Just some comments on language. Think it's a little awkward right now.

Looks like a few clean up things snuck into there, but nice!

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

_{Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic}

Light2Dark

Is there a issue this is attached too? I'm wondering if we should avoid adding UI config, just keep it in the toml.

If not, imo the UI is abit verbose. I would just say "Max tokens" and the description about overriding. You should also use the component to check if it has been overrided by the backend.

Light2Dark · 2026-05-29T04:18:25Z

+        # None-as-delete: any key whose merged value is None (typically because
+        # the incoming config explicitly sent null) is removed from disk. Lets
+        # the UI clear optional scalars (e.g. ai.max_tokens) without a separate
+        # delete primitive.
+        _drop_none_values(cast(dict[str, Any], merged))


We should note this for bug bash. We had issues in the past where config changes on the frontend didn't save, this whole function is a bit fragile.

I'll add it to notes for next release

do you think we need more tests around this?

Maybe..

The previous issues was if you add a custom model, then refresh, it wouldn't be saved.
Same if you ticked certain models in the AI models dropdown, and then refreshed, it would disappear.
Same with deleting a custom model.

I would test ^ during bug bash

kirangadhave · 2026-05-29T05:52:00Z

Is there a issue this is attached too? I'm wondering if we should avoid adding UI config, just keep it in the toml.

I feel there is value in UI config. Just so one can override it if they are using local models like Ollama with much lower limits. (i assume we can just add a local provider with url?)

If not, imo the UI is abit verbose. I would just say "Max tokens" and the description about overriding. You should also use the component to check if it has been overrided by the backend.
How would backend overide it? Only source of truth is toml. Only thing backend has is a default value for anthropic provider

Do you suggest a specific copy for the text?

kirangadhave · 2026-05-29T05:54:44Z

Is there a issue this is attached too?

Only a linear issue. ID mentioned in PR desc

Light2Dark · 2026-05-29T06:08:19Z

Do you suggest a specific copy for the text?

Max tokens: [ disabled if override is unticked ] [ ] Override
We recommend leaving this off as each provider applies their own limit, except for Anthropic where we set a default. Use this to control costs or raise limits for heavy-reasoning models.

maybe something like this?

(i assume we can just add a local provider with url?)

yeap, thats possible

Adds an optional ai.max_tokens setting in Settings > AI that overrides every provider's default. Anthropic gets a 32768 floor when unset; sending null removes the key from disk.

Consolidate the two descriptions into a single recommendation per review feedback.

kirangadhave · 2026-05-29T06:18:44Z

Fixed copy:

Reads much better now. Thanks @Light2Dark

Light2Dark · 2026-05-29T08:49:27Z

Fixed copy

i added a PR here #9720 for the UI, feel free to use it or leave it.

I had an issue where my config wasn't being saved on refresh. So it keeps showing this

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

_{Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic}

Track the override-enabled state locally instead of deriving it from the field value, so clearing the input no longer disables it and unchecks the Override checkbox mid-edit.

Light2Dark

thanks for fixing!

github-actions Bot added the bash-focus Area to focus on during release bug bash label May 27, 2026

cubic-dev-ai Bot reviewed May 27, 2026

View reviewed changes

Comment thread packages/openapi/api.yaml Outdated

kirangadhave added the bug Something isn't working label May 27, 2026

kirangadhave force-pushed the kg/ai-max-tokens-override branch from 42eb503 to e28fe6f Compare May 28, 2026 01:38

vercel Bot deployed to Preview May 28, 2026 01:39 View deployment

vercel Bot deployed to Preview May 28, 2026 01:40 View deployment

kirangadhave force-pushed the kg/ai-max-tokens-override branch from 02ae3de to 3ab92a1 Compare May 28, 2026 21:03

kirangadhave marked this pull request as ready for review May 28, 2026 21:04

Copilot AI review requested due to automatic review settings May 28, 2026 21:04

Copilot started reviewing on behalf of kirangadhave May 28, 2026 21:04 View session

vercel Bot deployed to Preview May 28, 2026 21:04 View deployment

Copilot AI reviewed May 28, 2026

View reviewed changes

Comment thread marimo/_server/api/endpoints/ai.py Outdated

Comment thread frontend/src/components/app-config/user-config-form.tsx Outdated

vercel Bot deployed to Preview May 28, 2026 21:22 View deployment

kirangadhave marked this pull request as draft May 28, 2026 21:23

kirangadhave marked this pull request as ready for review May 28, 2026 21:38

kirangadhave requested review from Light2Dark, Copilot and dmadisetti May 28, 2026 21:39

Copilot started reviewing on behalf of kirangadhave May 28, 2026 21:41 View session

Copilot AI reviewed May 28, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

Comment thread frontend/src/components/app-config/ai-config.tsx

Comment thread frontend/src/components/editor/ai/completion-utils.ts Outdated

Comment thread frontend/src/components/editor/ai/completion-utils.ts Outdated

cubic-dev-ai Bot reviewed May 28, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

Comment thread marimo/_server/models/completion.py Outdated

Comment thread frontend/src/components/editor/ai/completion-utils.ts Outdated

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

dmadisetti reviewed May 28, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

Comment thread marimo/_config/manager.py

kirangadhave commented May 29, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

vercel Bot deployed to Preview May 29, 2026 00:03 View deployment

cubic-dev-ai Bot reviewed May 29, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx Outdated

kirangadhave requested a review from dmadisetti May 29, 2026 00:17

vercel Bot deployed to Preview May 29, 2026 00:18 View deployment

kirangadhave force-pushed the kg/ai-max-tokens-override branch from 98e0c34 to f3e473b Compare May 29, 2026 01:20

vercel Bot deployed to Preview May 29, 2026 01:21 View deployment

kirangadhave force-pushed the kg/ai-max-tokens-override branch from f3e473b to 87e0d04 Compare May 29, 2026 04:07

vercel Bot deployed to Preview May 29, 2026 04:07 View deployment

Light2Dark reviewed May 29, 2026

View reviewed changes

kirangadhave requested a review from Light2Dark May 29, 2026 05:53

kirangadhave added 3 commits May 28, 2026 23:11

feat(ai): per-provider max_tokens override with null-as-delete

4678db5

Adds an optional ai.max_tokens setting in Settings > AI that overrides every provider's default. Anthropic gets a 32768 floor when unset; sending null removes the key from disk.

docs(ai): refine max_tokens setting descriptions

8f8126a

docs(ai): simplify max_tokens override copy

925418c

Consolidate the two descriptions into a single recommendation per review feedback.

kirangadhave force-pushed the kg/ai-max-tokens-override branch from 87e0d04 to 925418c Compare May 29, 2026 06:18

vercel Bot deployed to Preview May 29, 2026 06:19 View deployment

small cleanup AI config - max tokens (#9720)

9613f17

vercel Bot deployed to Preview May 29, 2026 14:07 View deployment

cubic-dev-ai Bot reviewed May 29, 2026

View reviewed changes

Comment thread frontend/src/components/app-config/ai-config.tsx

fix(ai): keep max output tokens input editable after clearing

a925628

Track the override-enabled state locally instead of deriving it from the field value, so clearing the input no longer disables it and unchecks the Override checkbox mid-edit.

kirangadhave requested review from akshayka and removed request for Light2Dark May 29, 2026 17:17

vercel Bot deployed to Preview May 29, 2026 17:17 View deployment

kirangadhave requested a review from Light2Dark May 29, 2026 17:24

Light2Dark approved these changes May 30, 2026

View reviewed changes

kirangadhave merged commit 87446a5 into main May 30, 2026
44 checks passed

kirangadhave deleted the kg/ai-max-tokens-override branch May 30, 2026 19:11

Conversation

kirangadhave commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Behavior

Uh oh!

vercel Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

kirangadhave commented May 28, 2026

Uh oh!

cubic-dev-ai Bot commented May 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dmadisetti left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Light2Dark left a comment

Choose a reason for hiding this comment

Uh oh!

Light2Dark May 29, 2026

Choose a reason for hiding this comment

Uh oh!

kirangadhave May 29, 2026

Choose a reason for hiding this comment

Uh oh!

kirangadhave May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Light2Dark May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kirangadhave commented May 29, 2026

Uh oh!

kirangadhave commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Light2Dark commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kirangadhave commented May 29, 2026

Uh oh!

kirangadhave commented May 27, 2026 •

edited

Loading

vercel Bot commented May 27, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading

Light2Dark May 29, 2026 •

edited

Loading

kirangadhave commented May 29, 2026 •

edited

Loading

Light2Dark commented May 29, 2026 •

edited

Loading

Light2Dark commented May 29, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading