fix(pricing): April 2026 drift — Opus 3× over-billing + 11 missing models #336

Merged: Destynova2 merged 1 commit into main from fix/pricing-april-2026-drift on May 15, 2026
@Destynova2

Summary

The static pricing table still reflected pre-2026 rates. This caused silent 3× over-billing on Anthropic Opus (4.6 onwards), 2× over-billing on MiniMax M2.5 input, and 2× under-billing on Gemini 2.5 Pro output. Models without a row (GPT-5/5.5, DeepSeek V4, Grok 4.1, Mercury 2) had no price at all: pricing() returned None for them, which broke spend tracking entirely.

Critical corrections

Rates are input/output in USD per million tokens.

| Model | Was | Now | Impact |
| --- | --- | --- | --- |
| claude-opus-4-6 | $15/$75 | $5/$25 | 3× over |
| claude-opus-4-7 | (missing) | $5/$25 | silent fallback |
| claude-haiku-4-5 | $0.80/$4 | $1/$5 | 20% under |
| MiniMax-M2.5 | $0.30/$1.20 | $0.15/$0.95 | 2× over (input) |
| gemini-2.5-pro | $1.25/$5 | $1.25/$10 | 2× under (output) |
| deepseek-chat/reasoner | $0.27/$1.10 | $0.14/$0.28 | legacy rerouted to V4-Flash |
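
To make the size of the Opus correction concrete, here is a minimal sketch of the cost arithmetic, assuming rates are expressed in USD per million tokens as in the table above. The `Rate` type and `cost_usd` function are hypothetical names used only for illustration; they are not taken from the crate.

```rust
/// Hypothetical rate pair for illustration: USD per one million tokens.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rate {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Cost in USD for one request: tokens times the per-million rate,
/// summed over input and output, then scaled back down by one million.
pub fn cost_usd(rate: Rate, input_tokens: u64, output_tokens: u64) -> f64 {
    let input = input_tokens as f64 * rate.input_per_mtok;
    let output = output_tokens as f64 * rate.output_per_mtok;
    (input + output) / 1_000_000.0
}

/// Old and corrected claude-opus-4-6 rates, copied from the table above.
pub const OPUS_OLD: Rate = Rate { input_per_mtok: 15.0, output_per_mtok: 75.0 };
pub const OPUS_NEW: Rate = Rate { input_per_mtok: 5.0, output_per_mtok: 25.0 };
```

Because both the input and output rates dropped by the same factor, the old table over-billed every Opus request by 3× regardless of its input/output mix. For the other rows the ratio depends on the mix, since only one side changed or the two sides changed by different factors.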

New entries (were returning None)

  • OpenAI: gpt-5, gpt-5.5, gpt-5.5-pro, o1/o1-mini, o3/o3-mini, gpt-oss-20b/120b
  • DeepSeek: deepseek-v4-flash, deepseek-v4-pro
  • Z.ai: glm-4.5-flash
  • MiniMax: M2.7
  • Llama: llama-4-scout-17b (Groq free tier)
  • Inception: mercury-2 ($0.25/$0.75, supersedes mercury-coder-small)
  • xAI: grok-4.1-fast
  • Anthropic: claude-sonnet-4-7
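
The failure mode for missing rows can be sketched as follows. This is an illustrative stand-in, not the crate's code: the PR only states that pricing() returned None for unknown models, so the signature, the tuple return type, the exact model ID strings (including `deepseek-chat` and `deepseek-reasoner`), and the `record_spend` helper are all assumptions. Only rates quoted in this PR are included.

```rust
/// Hypothetical lookup: (input, output) in USD per million tokens,
/// or None when the model has no row in the static table.
pub fn pricing(model: &str) -> Option<(f64, f64)> {
    match model {
        "claude-opus-4-6" | "claude-opus-4-7" => Some((5.0, 25.0)),
        "claude-haiku-4-5" => Some((1.0, 5.0)),
        "MiniMax-M2.5" => Some((0.15, 0.95)),
        "gemini-2.5-pro" => Some((1.25, 10.0)),
        "mercury-2" => Some((0.25, 0.75)),
        // Legacy DeepSeek names are rerouted to the V4-Flash rates.
        "deepseek-chat" | "deepseek-reasoner" | "deepseek-v4-flash" => Some((0.14, 0.28)),
        _ => None,
    }
}

/// Hypothetical caller: surfaces a missing row as an error instead of
/// silently recording zero spend.
pub fn record_spend(model: &str, input_tokens: u64, output_tokens: u64) -> Result<f64, String> {
    let (input_rate, output_rate) =
        pricing(model).ok_or_else(|| format!("no pricing row for model `{model}`"))?;
    let total = input_tokens as f64 * input_rate + output_tokens as f64 * output_rate;
    Ok(total / 1_000_000.0)
}
```

Returning an error from the caller is one possible design choice. It makes table drift visible as soon as a new model ID appears in traffic, rather than only when someone audits the spend totals.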

Deferred

Reasoning-token 3× surcharge for o-series models requires a CostInputs struct and is deferred to a follow-up PR that also wires cache_read/cache_creation multipliers (Anthropic prompt cache revenue leak).
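
As a rough picture of what the deferred work might look like, the sketch below shows one possible shape for a CostInputs struct. Everything here is an assumption: the field names, the separate `Multipliers` type, and in particular the billing model (reasoning tokens charged at the output rate times a multiplier, cache tokens charged at the input rate times a multiplier). The follow-up PR may define these differently.

```rust
/// Hypothetical token counts for one request.
#[derive(Debug, Clone, Copy, Default)]
pub struct CostInputs {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub reasoning_tokens: u64,
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
}

/// Hypothetical per-model multipliers; values are supplied by the caller.
#[derive(Debug, Clone, Copy)]
pub struct Multipliers {
    pub reasoning: f64,      // applied to the output rate (assumption)
    pub cache_read: f64,     // applied to the input rate (assumption)
    pub cache_creation: f64, // applied to the input rate (assumption)
}

/// Cost in USD, with rates given in USD per million tokens.
pub fn cost_with_multipliers(
    input_rate: f64,
    output_rate: f64,
    m: Multipliers,
    c: CostInputs,
) -> f64 {
    let input = c.input_tokens as f64 * input_rate;
    let output = c.output_tokens as f64 * output_rate;
    let reasoning = c.reasoning_tokens as f64 * output_rate * m.reasoning;
    let cache_read = c.cache_read_tokens as f64 * input_rate * m.cache_read;
    let cache_creation = c.cache_creation_tokens as f64 * input_rate * m.cache_creation;
    (input + output + reasoning + cache_read + cache_creation) / 1_000_000.0
}
```

Grouping the token counts into a struct keeps the function signature stable as further token categories are added, which is presumably why the PR ties the surcharge to a new struct rather than extra positional arguments.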

Test plan

  • cargo test --lib pricing — 17 new test cases for corrections + new entries
  • cargo test --lib --tests — 263 passed, 0 failed
  • cargo test --doc pricing — 2 doctests pass
  • cargo test --test lib enterprise:: — 87 passed
  • cargo clippy --lib -- -D warnings — clean

🤖 Generated with Claude Code

…DeepSeek-V4

Static pricing table reflected pre-2026 rates. This caused silent
over-billing (Opus 4.6/4.7 at $15/$75 instead of $5/$25 since 4.6
shipped), under-billing (Gemini 2.5 Pro output at $5/M instead of $10/M),
and complete fallback miss for models with no row at all (GPT-5/5.5,
DeepSeek V4, Grok 4.1, Mercury 2).

Critical corrections:
- claude-opus-4-7 (added at $5/$25)
- claude-opus-4-6: $15/$75 → $5/$25
- claude-haiku-4-5: $0.80/$4 → $1/$5
- MiniMax-M2.5: $0.30/$1.20 → $0.15/$0.95
- gemini-2.5-pro: $1.25/$5 → $1.25/$10
- deepseek-chat/reasoner: rerouted to V4-Flash rates ($0.14/$0.28)

New entries:
- OpenAI: gpt-5, gpt-5.5, gpt-5.5-pro, o1/o1-mini, o3/o3-mini, gpt-oss-20b/120b
- DeepSeek: deepseek-v4-flash, deepseek-v4-pro
- Z.ai: glm-4.5-flash
- MiniMax: M2.7
- Llama: llama-4-scout-17b (Groq free)
- Inception: mercury-2 ($0.25/$0.75, supersedes mercury-coder-small)
- xAI: grok-4.1-fast
- Anthropic: claude-sonnet-4-7

Reasoning-token 3× surcharge (o-series) deferred to follow-up PR
introducing CostInputs + cache_read/creation multipliers.

Test asserts and one upstream test in features::token_pricing updated
to reflect the corrected Opus rates.

@Destynova2 enabled auto-merge May 15, 2026 09:29
@Destynova2 merged commit ea1665d into main May 15, 2026
43 checks passed
@Destynova2 deleted the fix/pricing-april-2026-drift branch May 15, 2026 09:41