fix: HuggingFace model pricing, vendor detection, cost precision (v1.9.3)#135

Merged
devonakelley merged 1 commit into main from
fix/hf-pricing-vendor-detection-cost-precision
Mar 29, 2026

@devonakelley

Summary

Three fixes for $0 cost display and missing model pricing in telemetry.

Fix 1: pricing.py — Missing HuggingFace models

Added pricing entries for models actively used in the routing pipeline that were returning fallback/wrong costs:

  • meta-llama/llama-3.3-70b-instruct ($0.90/$0.90 per 1M)
  • mistralai/mixtral-8x22b-v0.1 ($1.20/$1.20 per 1M)
  • qwen/qwen2.5-72b-instruct ($0.90/$0.90 per 1M)
  • deepseek/deepseek-r1 ($0.55/$2.19 per 1M)

Also added a HuggingFace fuzzy matching block to normalize_model_name() — this block was missing entirely, meaning all HF models fell through to raw string lookup and often missed.
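A minimal sketch of how the new pricing entries and a HuggingFace fuzzy match could fit together. The dictionary name `HF_PRICING`, the `(input, output)` tuple layout, the helper `estimate_cost`, and the matching rule are assumptions for illustration, not the actual contents of pricing.py; only the model names and prices come from this PR.

```python
# Hypothetical pricing table: USD per 1M tokens, assumed to be (input, output).
HF_PRICING = {
    "meta-llama/llama-3.3-70b-instruct": (0.90, 0.90),
    "mistralai/mixtral-8x22b-v0.1": (1.20, 1.20),
    "qwen/qwen2.5-72b-instruct": (0.90, 0.90),
    "deepseek/deepseek-r1": (0.55, 2.19),
}


def normalize_model_name(model: str) -> str:
    """Map a raw model string to a pricing key (assumed matching rule)."""
    name = model.strip().lower()
    if name in HF_PRICING:
        return name
    # Fuzzy step: compare only the part after the org prefix, so that
    # "Llama-3.3-70B-Instruct" still resolves to the meta-llama entry.
    short = name.split("/")[-1]
    if short:
        for key in HF_PRICING:
            if short.startswith(key.split("/")[-1]):
                return key
    # No match: return the lowercased name for the caller's fallback logic.
    return name


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a call, using the table above (hypothetical helper)."""
    in_price, out_price = HF_PRICING[normalize_model_name(model)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

Under these assumptions, `normalize_model_name("Llama-3.3-70B-Instruct")` resolves to the `meta-llama/llama-3.3-70b-instruct` entry instead of missing on a raw string lookup.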

Fix 2: openai_instr.py — Vendor detection for HF-routed models

Models like meta-llama/Llama-3.3-70B-Instruct are called via the OpenAI-compatible API but were being attributed as vendor openai, hitting the GPT-4 pricing fallback ($30/$60 per 1M) instead of correct HF pricing.

Updated _detect_vendor() to detect Llama, Mixtral, and Qwen model names and route them to the huggingface vendor for correct cost attribution.

Fix 3: collector.py — Cost precision

Cost was stored at 6 decimal places, but the dashboard was rendering small values as $0.00. cost_usd is now rounded to 4 decimal places in the event payload so that sub-cent costs (e.g. $0.0028) display correctly instead of showing $0.
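As a rough illustration of the rounding step, assuming a hypothetical `build_event_payload` helper (the actual payload construction in collector.py is not shown in this PR):

```python
def build_event_payload(cost_usd: float) -> dict:
    """Build a telemetry event with cost rounded to 4 decimal places.

    Hypothetical helper; only the cost_usd field name and the
    4-decimal rounding come from the PR description.
    """
    return {"cost_usd": round(cost_usd, 4)}


payload = build_event_payload(0.002812)

# Formatting the same value with two decimals hides sub-cent costs,
# which is the "$0.00" symptom described above.
two_decimals = f"${payload['cost_usd']:.2f}"   # "$0.00"
four_decimals = f"${payload['cost_usd']:.4f}"  # "$0.0028"
```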

Testing

All 72 existing tests pass. No breaking changes — all fixes are additive.

Version

Bumped to 1.9.3.

…cision

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@devonakelley devonakelley merged commit 4a940a7 into main Mar 29, 2026
4 checks passed