
Add token metrics across sessions, analytics, and UI #232

Open
tpn wants to merge 1 commit into wesm:main from tpn:tpn/229-token-metrics-pr

Conversation

@tpn
Contributor

@tpn tpn commented Mar 25, 2026

Summary

  • persist truthful token coverage metadata across SQLite/PostgreSQL, parser output, sync/upload, and legacy repair paths
  • add output_tokens analytics to summary/heatmap/top-sessions plus CSV export and supporting API/store types
  • surface token usage in the session/message UI with explicit missing-data placeholders instead of silent zeroes

Test Plan

  • CC=cc CXX=c++ make ensure-embed-dir && CC=cc CXX=c++ go test -tags fts5 ./internal/db ./internal/parser ./internal/server ./internal/sync -run 'Token|Analytics|Upload|Incremental' -count=1
  • CC=cc CXX=c++ TEST_PG_URL=postgres://agentsview_test:agentsview_test_password@localhost:5433/agentsview_test?sslmode=disable CGO_ENABLED=1 go test -tags 'fts5,pgtest' ./internal/postgres/... -run 'Token|Analytics|TopSessions|Heatmap|Store' -count=1
  • cd frontend && npm test -- src/lib/api/client-token-metrics.test.ts src/lib/stores/analytics.test.ts src/lib/components/layout/SessionBreadcrumb.test.ts src/lib/components/content/MessageContent.test.ts src/lib/components/content/SubagentInline.test.ts src/lib/utils/csv-export.test.ts src/lib/utils/format.test.ts src/lib/stores/messages.test.ts src/lib/stores/sessions.test.ts && npm run check

Reviewer Notes

  • RoboReview security passes were clean.
  • RoboReview code-review concerns were addressed in follow-up commits before this branch was squashed into the current topical history.
  • There is still a design-level question for v1 scope that is worth Wes weighing in on:
    • This PR intentionally does not add a full per-row PostgreSQL provenance/convergence model.
    • Current contract is: SQLite becomes authoritative after repair/resync; PostgreSQL token analytics are best-effort until a repaired/resynced SQLite source republishes with pg push --full.
    • Newer clients also gate the output_tokens controls on server capability rather than silently assuming older servers support them.
  • If we want stronger guarantees than that, I think that should be a follow-up feature for explicit trust/convergence signaling rather than silently expanding this PR further.

Closes #229

@roborev-ci

roborev-ci bot commented Mar 25, 2026

roborev: Combined Review (fba915d)

Summary Verdict: The PR successfully implements token metadata tracking and analytics, but requires fixes for handling unsupported metrics on legacy servers and ensuring compatibility with legacy parsers.

Medium

  • Location: frontend/src/lib/components/analytics/Heatmap.svelte:136,
    frontend/src/lib/components/analytics/TopSessions.svelte:31

    • Problem: The capability check hides the Output Tokens toggle but fails to reset analytics.metric or analytics.topMetric if they were already set to "output_tokens". If a
      user connects to a legacy server that does not support this metric, the client will continue sending metric=output_tokens, resulting in 400 Bad Request errors and formatting issues for missing fields.
    • Fix: Add logic (e.g., a reactive statement) to reset the active metric state to a supported
      default (such as "messages") when the output-token capability evaluates to false. Add a regression test for switching from a capable server response to an incapable one.
  • Location: internal/parser/types.go:336

    • Problem: The new parser-side
      coverage model (accumulateMessageTokenUsage and ParsedMessage.TokenPresence()) no longer falls back to treating non-zero numeric token counts as evidence of coverage, relying instead strictly on the new flags. Any parser that populates ContextTokens / OutputTokens without the new flags will have its
      tokens recorded, but coverage marked as false, silently excluding those sessions from badges and output_tokens analytics.
    • Fix: Preserve the old non-zero numeric fallback in the parser layer until all parsers are updated to emit authoritative Has*Tokens metadata, or exhaustively audit and update every
      token-producing parser before shipping.
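A minimal sketch of the first fix (resetting a stale metric selection when the capability gate evaluates to false). The `sanitizeMetric` helper, the `Metric` union, and the `"messages"` default are assumptions for illustration, not the project's actual store API:

```typescript
// Hypothetical helper: metric names and the fallback default are
// assumptions, not the real analytics-store types.
type Metric = "messages" | "sessions" | "output_tokens";

const FALLBACK_METRIC: Metric = "messages";

// Coerce a (possibly persisted) metric back to a supported default
// when the connected server lacks output-token support, so the client
// never sends metric=output_tokens to a legacy server.
function sanitizeMetric(metric: Metric, supportsOutputTokens: boolean): Metric {
  if (metric === "output_tokens" && !supportsOutputTokens) {
    return FALLBACK_METRIC;
  }
  return metric;
}

// In a Svelte component this could run as a reactive statement, e.g.:
// $: analytics.metric = sanitizeMetric(analytics.metric, capabilities.outputTokens);
```

Running the same coercion over both `analytics.metric` and `analytics.topMetric` would cover the Heatmap and TopSessions cases the review calls out.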

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tpn
Contributor Author

tpn commented Mar 25, 2026

@wesm my agent thinks it's best if we pause here and wait for your input re the bigger provenance trust/status model stuff, heh.

#232

I also captured the provenance/trust-boundary narrative in the PR body for Wes.

Current state:

  • tpn/229-token-metrics-pr is the cleaned, topical branch.
  • Local verification on that branch is green:
    • SQLite/backend slice passed
    • PG slice passed after bringing up the test container
    • frontend targeted suite passed
    • svelte-check has 0 errors and the same 5 unrelated pre-existing warnings
  • The PR branch is clean and pushed.

What is still unresolved:

  • The design reviewer is still pushing on the same bigger-scope question:
    explicit product behavior for stale NeedsResync() databases and PG non-convergence/capability/trust boundaries.
  • At this point, I do not think another small cleanup commit is the right move. The remaining objection is not about a bug in the shipped token feature; it is about whether this PR should also introduce a
    broader trust/status model.

My recommendation from here is:

  • stop changing code
  • let Wes weigh in on that provenance/trust-boundary question in the PR
  • if he wants that stronger model now, do it as a deliberate follow-up on top of this branch rather than keep stealth-expanding the token PR

If you want, I can next:

  1. add a PR comment explicitly summarizing the remaining design tradeoff, or
  2. switch to waiting for Wes/roborev feedback on the PR.

@tpn
Contributor Author

tpn commented Mar 25, 2026

The intent with this feature/PR is to present token consumption metrics as a first-class citizen within the UI. Especially now with the aggregated multi-agentsview stuff, it would be really valuable to see comprehensive token usage across all agents.

- feat: add token analytics metrics
- feat: surface token usage in the UI
- docs: add token metrics design note

@wesm wesm force-pushed the tpn/229-token-metrics-pr branch from fba915d to bb286fe on March 28, 2026 03:01
@tpn
Contributor Author

tpn commented Mar 28, 2026

@wesm just FYI I've got an agent still cranking away on a "robust" solution for this... so I wouldn't put too much effort into reviewing the current state as it appears it's going to redo it pretty significantly (I could also do a separate PR for that so we can objectively compare what's going on). From what I can gather... the proper fix is more involved/nuanced.

@roborev-ci

roborev-ci bot commented Mar 28, 2026

roborev: Combined Review (bb286fe)

Verdict: One medium-severity regression should be addressed before merge; otherwise the changes look sound.

Medium

  • frontend/src/lib/components/analytics/Heatmap.svelte:133, frontend/src/lib/components/analytics/TopSessions.svelte:24
    The capability gate only hides the output_tokens UI controls and does not sanitize an already-selected persisted value. In a mixed-version setup, a client can still send metric=output_tokens to an older server and trigger 400 responses if that metric was restored from client state or chosen before capability detection completed.

    Fix: enforce the capability in the analytics store as well as the UI. When summary capability indicates token metrics are unavailable, coerce metric and topMetric back to supported defaults before issuing requests, and ignore persisted output_tokens selections until capability is confirmed.
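A minimal sketch of that store-level enforcement, using a tri-state capability so that an `output_tokens` selection restored from persisted client state is ignored until the server has positively confirmed support. The `Capability` type and `effectiveMetric` name are assumptions for illustration, not the project's actual analytics-store API:

```typescript
// Hypothetical tri-state: capability is unknown until the first
// summary response arrives.
type Capability = "unknown" | "supported" | "unsupported";

const DEFAULT_METRIC = "messages";

// Decide which metric to actually put on the wire. A persisted
// "output_tokens" selection is treated as unsupported while the
// capability is still unknown, so a request can never carry the
// metric before capability detection has completed.
function effectiveMetric(persisted: string, tokenCapability: Capability): string {
  if (persisted === "output_tokens" && tokenCapability !== "supported") {
    return DEFAULT_METRIC;
  }
  return persisted;
}
```

Applying this at request-construction time in the store (rather than only hiding the toggle) closes the mixed-version window the review describes.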


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@wesm
Owner

wesm commented Mar 28, 2026

OK I'll hold off on this one and cut 0.17.0 once I merge your codex subagent PR



Development

Successfully merging this pull request may close these issues.

Add token usage metrics to agentsview
