42 changes: 42 additions & 0 deletions tools/session-token-scan/README.md
@@ -0,0 +1,42 @@
# Session Token Scan

`tools/session-token-scan/scan.py` is a deterministic, read-only scanner for
Every Code rollout/session files. It highlights token-efficiency suspects before
prompt, memory, agent, or history behavior is changed.

The scanner does not call an LLM. Optional local or cheaper model classifiers
should be layered on later and should consume only the compact scanner output or
narrowed suspect spans, not whole raw rollout files.
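As a rough illustration of what "compact output" could mean for a downstream classifier, the sketch below walks one decoded record and keeps only a path, a byte size, and a short preview for each oversized or image-like string. The function names, the suspect dictionary layout, and the preview length are hypothetical and are not taken from `scan.py`.

```python
LARGE_THRESHOLD = 16384  # bytes; same value as the --large-threshold example
PREVIEW_CHARS = 80       # hypothetical preview length, not a scanner setting


def iter_strings(value, path=""):
    """Yield (path, string) for every string nested in a decoded JSON value."""
    if isinstance(value, str):
        yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from iter_strings(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from iter_strings(child, f"{path}[{index}]")


def compact_suspects(record, threshold=LARGE_THRESHOLD):
    """Return small suspect entries instead of the raw payloads themselves."""
    suspects = []
    for path, text in iter_strings(record):
        size = len(text.encode("utf-8"))
        is_image = text.startswith("data:image")
        if size >= threshold or is_image:
            suspects.append({
                "path": path,
                "bytes": size,
                "data_image": is_image,
                "preview": text[:PREVIEW_CHARS],  # narrowed span only
            })
    return suspects
```

A classifier fed these entries sees where the bulk is and what it looks like without ever receiving the full base64 blob or the whole rollout file.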

## Run

```sh
python3 tools/session-token-scan/scan.py ~/.code/sessions/2026/05/11 --limit 20
```

Useful options:

- `--limit 0`: scan all discovered rollout files under the inputs.
- `--json`: emit machine-readable output for PR evidence or later classifiers.
- `--usage-root ~/.code/usage`: correlate timestamped usage entries when
  available.
- `--large-threshold 16384`: set the byte threshold for large-record suspects.
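For scripted use, the options above can be combined from Python. This is a minimal sketch that assumes the command is run from the repository root and that `--json` writes a single JSON document to stdout; the README does not spell out the output schema, so the result is returned undecoded beyond `json.loads`. The helper names `build_command` and `run_scan` are hypothetical.

```python
import json
import os
import subprocess
import sys

SCAN = "tools/session-token-scan/scan.py"


def build_command(inputs, limit=20, usage_root=None, large_threshold=None):
    """Build the scanner argv using only the flags listed in this README."""
    cmd = [sys.executable, SCAN]
    # No shell is involved, so expand "~" ourselves.
    cmd += [os.path.expanduser(str(p)) for p in inputs]
    cmd += ["--limit", str(limit), "--json"]
    if usage_root is not None:
        cmd += ["--usage-root", os.path.expanduser(str(usage_root))]
    if large_threshold is not None:
        cmd += ["--large-threshold", str(large_threshold)]
    return cmd


def run_scan(inputs, **options):
    """Run the scanner and decode stdout (assumed to be one JSON document)."""
    proc = subprocess.run(
        build_command(inputs, **options),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)
```

For example, `run_scan(["~/.code/sessions/2026/05/11"], limit=0, usage_root="~/.code/usage")` would scan every discovered rollout file under that directory and correlate usage entries where they exist.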

## Reports

The text report includes:

- peak cumulative and largest single-turn token usage
- cache ratio from recorded token counts
- number of token-total resets, counted each time a cumulative counter restarts inside a rollout
- rollout file size, image payload bytes, and base-instruction size
- duplicated project-doc/skill injection flags
- largest token-count events
- largest persisted records and string fields
- `data:image` and `input_image` suspects
- optional usage-entry totals that overlap the session timestamp range
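To make a few of these metrics concrete, the sketch below derives peak cumulative usage, largest single-turn usage, and a reset count from a list of cumulative token totals, plus a cache ratio from two counts. It is an illustration under stated assumptions, not the scanner's actual logic: the ratio is assumed to be cached input tokens divided by total input tokens, and a reset is assumed to be any point where the cumulative total decreases.

```python
def cache_ratio(cached_input_tokens, input_tokens):
    """Fraction of input tokens served from cache (assumed definition)."""
    if input_tokens <= 0:
        return 0.0
    return cached_input_tokens / input_tokens


def summarize_totals(cumulative_totals):
    """Summarize a sequence of cumulative token totals from one rollout."""
    peak = 0
    largest_turn = 0
    resets = 0
    previous = 0
    for total in cumulative_totals:
        if total < previous:
            # The counter restarted; treat the new total as this turn's usage.
            resets += 1
            turn = total
        else:
            turn = total - previous
        peak = max(peak, total)
        largest_turn = max(largest_turn, turn)
        previous = total
    return {"peak": peak, "largest_turn": largest_turn, "resets": resets}
```

With totals of `[100, 250, 400, 50, 120]`, the drop from 400 to 50 counts as one reset, the peak is 400, and the largest single turn is 150.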

## Scope

This tool reads local files only. It does not mutate session history, enforce
budgets, change model routing, or compact stored payloads.