@ahundt |
The "120x fewer tokens" claim comes from a controlled benchmark. Here's the methodology so you can verify it yourself.
Setup: 5 structural questions about a real codebase (function lookup, call tracing, dead code, route listing, architecture overview). Each question was asked twice: once via codebase-memory-mcp graph queries, and once via a Claude Code Explorer agent that uses grep/Glob/Read tools.
Measurement: Total input + output tokens consumed by all tool calls to answer each question.
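If you want to reproduce the accounting, here is a minimal Python sketch of the measurement described above: sum input + output tokens over every tool call for a question, then divide the Explorer total by the graph-query total. The `ToolCall` type, the function names, and every number in the example are hypothetical placeholders, not the benchmark's tooling or results.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ToolCall:
    """One tool invocation and the tokens it consumed (hypothetical record type)."""
    name: str
    input_tokens: int
    output_tokens: int


def total_tokens(calls: Iterable[ToolCall]) -> int:
    """Total input + output tokens across all tool calls for one question."""
    return sum(c.input_tokens + c.output_tokens for c in calls)


def token_ratio(explorer_calls: Iterable[ToolCall], graph_calls: Iterable[ToolCall]) -> float:
    """How many times more tokens the Explorer run used than the graph run."""
    graph_total = total_tokens(graph_calls)
    if graph_total == 0:
        raise ValueError("graph run consumed no tokens; ratio is undefined")
    return total_tokens(explorer_calls) / graph_total


# Placeholder numbers for illustration only; these are NOT benchmark data.
explorer_run = [
    ToolCall("Glob", input_tokens=50, output_tokens=400),
    ToolCall("Grep", input_tokens=60, output_tokens=1200),
    ToolCall("Read", input_tokens=40, output_tokens=6000),
]
graph_run = [ToolCall("graph_query", input_tokens=80, output_tokens=120)]

ratio = token_ratio(explorer_run, graph_run)
```

To check the claim against your own runs, record the real per-call token counts from your session logs and feed them in instead of the placeholders.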
Results:
The Explorer agent has to: read file listings → grep for patterns → read matching files → parse the output → grep again for related files → read those. Each step is a tool call with full file contents in the response.
The graph query returns only the requested structural information, in a single call. No file contents, no noise, no irrelevant matches.
Why it matters beyond fitting in the context window: cost (at $3-15 per million tokens, it adds up), latency (seconds of file reading vs. a <1ms graph query), and accuracy (LLMs lose track of details in large contexts).
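To put the cost point in concrete terms, the arithmetic is just tokens divided by one million, times the per-million price. The sketch below uses the $3-15/M range quoted above; the function names are hypothetical, and you should plug in your own measured token counts rather than treat any example value as a benchmark figure.

```python
# Price range quoted in this comment, in USD per million tokens.
PRICE_RANGE_USD_PER_MILLION = (3.0, 15.0)


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given token count at a given per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def cost_range(tokens: int) -> tuple[float, float]:
    """(low, high) cost in USD across the quoted price range."""
    low, high = PRICE_RANGE_USD_PER_MILLION
    return cost_usd(tokens, low), cost_usd(tokens, high)


# Example: one million tokens costs between $3 and $15 at the quoted prices.
low, high = cost_range(1_000_000)
```

Running `cost_range` on the Explorer total and on the graph-query total for the same question shows the per-question difference at both ends of the price range.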
Full benchmark data: See BENCHMARK_REPORT.md and the Performance section in the README.