
Add token metrics across sessions, analytics, and UI #232

Open
tpn wants to merge 1 commit into wesm:main from tpn:tpn/229-token-metrics-pr

Conversation

@tpn
Contributor

@tpn tpn commented Mar 25, 2026

Summary

  • persist truthful token coverage metadata across SQLite/PostgreSQL, parser output, sync/upload, and legacy repair paths
  • add output_tokens analytics to summary/heatmap/top-sessions plus CSV export and supporting API/store types
  • surface token usage in the session/message UI with explicit missing-data placeholders instead of silent zeroes

Test Plan

  • CC=cc CXX=c++ make ensure-embed-dir && CC=cc CXX=c++ go test -tags fts5 ./internal/db ./internal/parser ./internal/server ./internal/sync -run 'Token|Analytics|Upload|Incremental' -count=1
  • CC=cc CXX=c++ TEST_PG_URL=postgres://agentsview_test:agentsview_test_password@localhost:5433/agentsview_test?sslmode=disable CGO_ENABLED=1 go test -tags 'fts5,pgtest' ./internal/postgres/... -run 'Token|Analytics|TopSessions|Heatmap|Store' -count=1
  • cd frontend && npm test -- src/lib/api/client-token-metrics.test.ts src/lib/stores/analytics.test.ts src/lib/components/layout/SessionBreadcrumb.test.ts src/lib/components/content/MessageContent.test.ts src/lib/components/content/SubagentInline.test.ts src/lib/utils/csv-export.test.ts src/lib/utils/format.test.ts src/lib/stores/messages.test.ts src/lib/stores/sessions.test.ts && npm run check

Reviewer Notes

  • RoboReview security passes were clean.
  • RoboReview code-review concerns were addressed in follow-up commits before this branch was squashed into the current topical history.
  • There is still a design-level question for v1 scope that is worth Wes weighing in on:
    • This PR intentionally does not add a full per-row PostgreSQL provenance/convergence model.
    • Current contract is: SQLite becomes authoritative after repair/resync; PostgreSQL token analytics are best-effort until a repaired/resynced SQLite source republishes with pg push --full.
    • Newer clients also gate the output_tokens controls on server capability rather than silently assuming older servers support them.
  • If we want stronger guarantees than that, I think that should be a follow-up feature for explicit trust/convergence signaling rather than silently expanding this PR further.

Closes #229

@roborev-ci

roborev-ci bot commented Mar 25, 2026

roborev: Combined Review (fba915d)

Summary Verdict: The PR successfully implements token metadata tracking and analytics, but requires fixes for handling unsupported metrics on legacy servers and ensuring compatibility with legacy parsers.

Medium

  • Location: frontend/src/lib/components/analytics/Heatmap.svelte:136,
    frontend/src/lib/components/analytics/TopSessions.svelte:31

    • Problem: The capability check hides the Output Tokens toggle but fails to reset analytics.metric or analytics.topMetric if they were already set to "output_tokens". If a
      user connects to a legacy server that does not support this metric, the client will continue sending metric=output_tokens, resulting in 400 Bad Request errors and formatting issues for missing fields.
    • Fix: Add logic (e.g., a reactive statement) to reset the active metric state to a supported
      default (such as "messages") when the output-token capability evaluates to false. Add a regression test for switching from a capable server response to an incapable one.
  • Location: internal/parser/types.go:336

    • Problem: The new parser-side
      coverage model (accumulateMessageTokenUsage and ParsedMessage.TokenPresence()) no longer falls back to treating non-zero numeric token counts as evidence of coverage, relying instead strictly on the new flags. Any parser that populates ContextTokens / OutputTokens without the new flags will have its
      tokens recorded, but coverage marked as false, silently excluding those sessions from badges and output_tokens analytics.
    • Fix: Preserve the old non-zero numeric fallback in the parser layer until all parsers are updated to emit authoritative Has*Tokens metadata, or exhaustively audit and update every
      token-producing parser before shipping.
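A minimal sketch of the first fix (resetting a stale metric selection when the capability gate evaluates to false). The `sanitizeMetric` helper, the `Metric` union, and the `"messages"` default are assumptions for illustration, not the project's actual store API:

```typescript
// Hypothetical helper: metric names and the fallback default are
// assumptions, not the real analytics-store types.
type Metric = "messages" | "sessions" | "output_tokens";

const FALLBACK_METRIC: Metric = "messages";

// Coerce a (possibly persisted) metric back to a supported default
// when the connected server lacks output-token support, so the client
// never sends metric=output_tokens to a legacy server.
function sanitizeMetric(metric: Metric, supportsOutputTokens: boolean): Metric {
  if (metric === "output_tokens" && !supportsOutputTokens) {
    return FALLBACK_METRIC;
  }
  return metric;
}

// In a Svelte component this could run as a reactive statement, e.g.:
// $: analytics.metric = sanitizeMetric(analytics.metric, capabilities.outputTokens);
```

Running the same coercion over both `analytics.metric` and `analytics.topMetric` would cover the Heatmap and TopSessions cases the review calls out.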

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tpn
Contributor Author

tpn commented Mar 25, 2026

@wesm my agent thinks it's best if we pause here and wait for your input re the bigger provenance trust/status model stuff, heh.

#232

I also captured the provenance/trust-boundary narrative in the PR body for Wes.

Current state:

  • tpn/229-token-metrics-pr is the cleaned, topical branch.
  • Local verification on that branch is green:
    • SQLite/backend slice passed
    • PG slice passed after bringing up the test container
    • frontend targeted suite passed
    • svelte-check has 0 errors and the same 5 unrelated pre-existing warnings
  • The PR branch is clean and pushed.

What is still unresolved:

  • The design reviewer is still pushing on the same bigger-scope question:
    explicit product behavior for stale NeedsResync() databases and PG non-convergence/capability/trust boundaries.
  • At this point, I do not think another small cleanup commit is the right move. The remaining objection is not about a bug in the shipped token feature; it is about whether this PR should also introduce a
    broader trust/status model.

My recommendation from here is:

  • stop changing code
  • let Wes weigh in on that provenance/trust-boundary question in the PR
  • if he wants that stronger model now, do it as a deliberate follow-up on top of this branch rather than keep stealth-expanding the token PR

If you want, I can next:

  1. add a PR comment explicitly summarizing the remaining design tradeoff, or
  2. switch to waiting for Wes/roborev feedback on the PR.

@tpn
Contributor Author

tpn commented Mar 25, 2026

The intent with this feature/PR is to present token consumption metrics as a first-class citizen within the UI. Especially now with the aggregated multi-agentsview stuff, it would be really valuable to see comprehensive token usage across all agents.

- feat: add token analytics metrics
- feat: surface token usage in the UI
- docs: add token metrics design note

@wesm wesm force-pushed the tpn/229-token-metrics-pr branch from fba915d to bb286fe on March 28, 2026 03:01
@tpn
Contributor Author

tpn commented Mar 28, 2026

@wesm just FYI I've got an agent still cranking away on a "robust" solution for this... so I wouldn't put too much effort into reviewing the current state as it appears it's going to redo it pretty significantly (I could also do a separate PR for that so we can objectively compare what's going on). From what I can gather... the proper fix is more involved/nuanced.

@roborev-ci

roborev-ci bot commented Mar 28, 2026

roborev: Combined Review (bb286fe)

Verdict: One medium-severity regression should be addressed before merge; otherwise the changes look sound.

Medium

  • frontend/src/lib/components/analytics/Heatmap.svelte:133, frontend/src/lib/components/analytics/TopSessions.svelte:24
    The capability gate only hides the output_tokens UI controls and does not sanitize an already-selected persisted value. In a mixed-version setup, a client can still send metric=output_tokens to an older server and trigger 400 responses if that metric was restored from client state or chosen before capability detection completed.

    Fix: enforce the capability in the analytics store as well as the UI. When summary capability indicates token metrics are unavailable, coerce metric and topMetric back to supported defaults before issuing requests, and ignore persisted output_tokens selections until capability is confirmed.
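A minimal sketch of that store-level enforcement, using a tri-state capability so that an `output_tokens` selection restored from persisted client state is ignored until the server has positively confirmed support. The `Capability` type and `effectiveMetric` name are assumptions for illustration, not the project's actual analytics-store API:

```typescript
// Hypothetical tri-state: capability is unknown until the first
// summary response arrives.
type Capability = "unknown" | "supported" | "unsupported";

const DEFAULT_METRIC = "messages";

// Decide which metric to actually put on the wire. A persisted
// "output_tokens" selection is treated as unsupported while the
// capability is still unknown, so a request can never carry the
// metric before capability detection has completed.
function effectiveMetric(persisted: string, tokenCapability: Capability): string {
  if (persisted === "output_tokens" && tokenCapability !== "supported") {
    return DEFAULT_METRIC;
  }
  return persisted;
}
```

Applying this at request-construction time in the store (rather than only hiding the toggle) closes the mixed-version window the review describes.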


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@wesm
Owner

wesm commented Mar 28, 2026

OK I'll hold off on this one and cut 0.17.0 once I merge your codex subagent PR



Development

Successfully merging this pull request may close these issues.

Add token usage metrics to agentsview
