fix(parser): fall back to message.id when requestId is absent #16
estelledc wants to merge 2 commits into
Proxied API calls (corporate Bedrock gateways, third-party LLM relays) strip the requestId header but preserve message.id. Previously, every chunk landed in the without_req path and the line_uuid dedup never fired (each JSONL line carries a unique uuid), so streaming responses with N content-block lines got summed N times. Observed inflation on a real proxy environment: ~2.6x across 28k events / 511 sessions. Post-patch totals match the byte-for-byte reference (CCCostMonitor's analyze_usage.py) within 0.4%. Both id-spaces are globally unique per Anthropic API call, so falling back to message.id preserves all the existing canonicalization guarantees (deltaize, ghost-chunk extraction, mirror dedup). Direct-API users (where requestId is present) are unaffected. Tests: 30 existing parser tests pass; adds 3 regression tests covering same-file streaming, mirror dedup, and requestId-precedence.
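The over-count mechanism can be sketched in isolation. This is an illustrative model, not the crate's real types: one streamed response is written as N JSONL lines, each with a fresh uuid but the same cumulative usage snapshot, so deduping by per-line uuid passes every snapshot through while keying on `message.id` collapses the stream.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for an assistant chunk parsed from a JSONL line.
struct Chunk {
    line_uuid: String,   // unique per JSONL line
    message_id: String,  // shared by all chunks of one API call
    output_tokens: u64,  // cumulative snapshot, identical on every line
}

// Sum output_tokens, counting each distinct key once.
fn total_by(chunks: &[Chunk], key: fn(&Chunk) -> String) -> u64 {
    let mut seen = HashSet::new();
    chunks
        .iter()
        .filter(|c| seen.insert(key(c)))
        .map(|c| c.output_tokens)
        .sum()
}

fn main() {
    // One streaming response, three content-block lines.
    let chunks: Vec<Chunk> = (0..3)
        .map(|i| Chunk {
            line_uuid: format!("uuid-{i}"),
            message_id: "msg_abc".into(),
            output_tokens: 500,
        })
        .collect();

    // line_uuid never repeats, so the dedup never fires: 3x inflation.
    assert_eq!(total_by(&chunks, |c| c.line_uuid.clone()), 1500);
    // Keying on message.id collapses the stream to a single snapshot.
    assert_eq!(total_by(&chunks, |c| c.message_id.clone()), 500);
}
```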
The previous commit fixes ~2.6x token inflation for proxied API users by falling back to message.id when requestId is absent. But existing v2 caches were populated by the buggy parser, and the high-water-mark merge would freeze those inflated values in place across upgrades. Bumping to v3 forces a one-time clean rebuild on first launch (the existing 'Cache rebuilt' banner already covers this UX).
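The version gate amounts to a small check at load time. A minimal sketch, with hypothetical names (`Cache`, `load_or_rebuild`); the real cache layout and the 'Cache rebuilt' banner live elsewhere:

```rust
const CACHE_VERSION: u32 = 3; // bumped from 2: v2 totals may be inflated

// Hypothetical on-disk cache shape.
struct Cache {
    version: u32,
    totals: Vec<u64>, // per-session token totals, schematically
}

fn load_or_rebuild(on_disk: Option<Cache>) -> Cache {
    match on_disk {
        // Same version: safe to reuse; the high-water-mark merge stays valid.
        Some(c) if c.version == CACHE_VERSION => c,
        // Missing or stale: one-time clean rebuild. The old totals are
        // discarded rather than merged, so inflated v2 values cannot be
        // frozen in place by the high-water-mark logic.
        _ => Cache { version: CACHE_VERSION, totals: Vec::new() },
    }
}

fn main() {
    let stale = Cache { version: 2, totals: vec![1_000_000] }; // inflated by the old bug
    let rebuilt = load_or_rebuild(Some(stale));
    assert_eq!(rebuilt.version, CACHE_VERSION);
    assert!(rebuilt.totals.is_empty()); // inflated values gone
}
```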
## Summary
Fixes ~2.6× token over-counting on Claude Code installs that point at a proxied API endpoint (corporate Bedrock gateways, third-party LLM relays). Direct-API users are unaffected.
## Root cause
`dedup_by_request_id` groups assistant chunks by `requestId` and deltaizes cumulative snapshots. When `requestId` is absent the event falls through to the `without_req` path, which deduplicates by `line_uuid`. But Claude Code assigns a unique `uuid` per JSONL line (every content-block line of one streaming response gets a different uuid), so this fallback never collapses anything. Modern Claude Code also writes the same final `usage` snapshot on every content-block line, so an N-block response gets summed N times.

The proxy case is the trigger: corporate gateways often strip the `requestId` request header but preserve `message.id` (it lives in the response body). On a real user's archive (511 sessions; 28,581 assistant-with-usage lines; 12,619 unique `message.id`s) this produced an inflation factor of 2.265× (token-weighted: 2.635×, since longer responses have more chunks). The 2.0.0 changelog claim "totals now match what Anthropic actually billed" silently regresses to a 2-3× over-count for these users.

## Fix
```rust
request_id = raw.request_id.or_else(|| msg.id.clone())
```

Both id-spaces (`req_…` and `msg_…`) are globally unique per Anthropic API call, so substituting `message.id` when `requestId` is missing reuses the existing canonicalization machinery (deltaize, ghost-chunk extraction, canonical-stream picker, mirror dedup) without touching any of it. Direct-API users keep `request_id` semantics exactly as before; the fallback only kicks in when the field is `None`.

## Verification
- Post-patch totals match the byte-for-byte reference within 0.4% (CCCostMonitor also keys on `message.id` in its py implementation)

## Tests
- `dedups_assistant_chunks_by_message_id_when_request_id_absent`: same-file streaming, identical cumulative snapshots
- `mirror_dedup_uses_message_id_when_request_id_absent`: parent + sub-agent files mirroring three cumulative chunks
- `prefers_request_id_over_message_id_when_both_present`: sanity check that `requestId` still wins when present

## Test plan
- `cargo test parser::` → 33 pass (30 existing + 3 new)
- No regressions in the pre-existing coverage (`captures_request_id`, `deltaized_tokens_sum_to_final_snapshot_across_streams`, etc.)

🤖 Generated with Claude Code