Write Python LLMs can read. ~50% fewer tokens, 100% same functionality.
A raisin is a grape with the water removed. Same sweetness, same nutrients, same fruit; half the mass. This project does the same thing to Python: it removes the water (docstrings, boilerplate, ceremonial type hints) and keeps the nutrients (logic, behavior, public API).
LLMs are trained to imitate human coding conventions: docstrings for Sphinx, type hints for IDE hover, verbose error handling for readable stack traces. None of these serve the LLM that's writing or reading the code.
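A hypothetical before/after of what that means in practice (illustrative only, not taken from the benchmark code):

```python
import json

# Human-imitating style: docstring and hints written for tools
# that never see the code when an LLM is the only reader.
def read_config(path: str, encoding: str = "utf-8") -> dict:
    """Load a JSON config file.

    Args:
        path: Filesystem path to the config file.
        encoding: Text encoding used to decode the file.
    """
    with open(path, encoding=encoding) as handle:
        return json.load(handle)

# LLM-native style: same public name, same behavior, two lines.
def read_config(path, encoding="utf-8"):
    with open(path, encoding=encoding) as f: return json.load(f)
```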
When we let LLMs write natively for machine-reading, they produce programs that pass the same test suites in roughly half the tokens. Six independent experiments confirm the pattern:
| Experiment | Saved | Verification |
|---|---|---|
| Click 8.2.1 retrofit | 55.5% | 738/738 tests pass |
| Flask 3.1.3 retrofit | 62.7% | syntax + structure |
| Bottle 0.13.4 retrofit | 37.6% | WSGI pipeline |
| Greenfield TODO CLI (written twice) | 51.8% | 28/28 tests pass |
| Guide → agent → URL shortener (from scratch) | 47.2% | 20/20 tests pass |
| Guide → agent → click/formatting.py (real library) | 44.4% | 738/738 tests pass |
| Single-file record: flask/helpers.py | 80.3% | syntax + structure |
Total: 786 tests verified. Zero regressions.
```
/plugin marketplace add Oldrich333/raisin
/plugin install raisin
```

Or install manually:

```bash
git clone https://github.com/Oldrich333/raisin.git /tmp/raisin
mkdir -p ~/.claude/skills
cp -r /tmp/raisin/plugins/raisin/skills/raisin ~/.claude/skills/
```

For other agents, copy plugins/raisin/skills/raisin/SKILL.md into your agent's skill directory.
The skill is a single self-contained file with no dependencies.
Use a slash command:

```
/halfcode   # primary command
/dense      # alias
/raisin     # alias (brand)
```

Or natural language:

```
write this dense
minimize tokens
compress src/utils.py
no docstrings, llm-native style
```
To prove compression isn't "retrofit cheating," we wrote the same program twice from scratch, in two styles. Both pass the same 28-test spec.
| Style | Tokens | LOC | Tests |
|---|---|---|---|
| Normal Python (docstrings, type hints, verbose errors) | 3,022 | 437 | 28/28 |
| LLM-native (dense from line 1) | 1,458 | 104 | 28/28 |
| Ratio | 48.2% | 23.8% | |
→ Full greenfield experiment
Sure, we can compress code. But can someone following the guide reproduce the results? We tested it twice:
Test 1: URL shortener from scratch (20 tests). We gave an agent only the guide plus the test suite. Result: 775 tokens, 20/20 pass, 47.2% savings.
Test 2: Click formatting.py, inside the real library (738 tests). We gave a different agent only the guide plus the original Click file. The agent wrote a dense version that passes all 738 of Click's own tests. Result: 1,195 tokens, 738/738 pass, 44.4% savings, within 5% of our hand-tuned reference.
→ Guide validation experiment
| | Tokens | LOC |
|---|---|---|
| Original | 5,399 | 641 |
| Dense rewrite | 1,064 | 80 |
| Saved | 80.3% | 87.5% |
Flask's helpers.py is mostly small utility functions, each wrapped in ~30 lines of docstrings and type overloads. The dense version preserves the full public API and behavior in 80 lines.
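A hypothetical sketch of the pattern being collapsed (illustrative names, not Flask's actual source):

```python
from typing import overload

# Before: a trivial helper padded with overload stubs and a docstring.
@overload
def ensure_list(value: None) -> list: ...
@overload
def ensure_list(value: list) -> list: ...
def ensure_list(value=None):
    """Return value as a list, treating None as empty."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

# After: identical behavior, public name intact, one line.
def ensure_list(v=None): return [] if v is None else (v if isinstance(v, list) else [v])
```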
```bash
python3 -m pip install pytest tiktoken pyyaml
```

```bash
bash tools/run_tests.sh original        # 738 passed
bash tools/run_tests.sh L1_clean        # 738 passed
bash tools/run_tests.sh LK_kolmogorov   # 738 passed
bash tools/run_tests.sh LK2_aggressive  # 738 passed
```

```bash
cd greenfield
TODO_IMPL=normal python3 -m pytest tests/ -q      # 28 passed
TODO_IMPL=kolmogorov python3 -m pytest tests/ -q  # 28 passed
```

```bash
cd guide_validation
URLSHORT_IMPL=normal python3 -m pytest spec/ -q      # 20 passed
URLSHORT_IMPL=kolmogorov python3 -m pytest spec/ -q  # 20 passed
```

```bash
python3 tools/measure.py
```

CODE_COMPRESSION_GUIDE.md is an M2M (machine-to-machine) document: no prose, no chatty explanations. It contains:
- FRAMING: what to optimize, what to preserve, what to ignore
- WHAT TO REMOVE: always waste (docstrings, comments, blank lines, overload stubs, internal type hints)
- WHAT TO RESTRUCTURE: the real gains (shared error handlers, validation helpers, dict dispatch, bulk attribute assignment; see the sketch after this list)
- NAMING RULES: short internal, clear public
- FORMATTING RULES: semicolons, one-liners, comprehensions
- VERIFICATION PROTOCOL: how to catch bugs without reverting
- CHECKLIST: grep patterns for each optimization opportunity
- GREENFIELD vs RETROFIT: different process for each
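A sketch of what two of those restructurings look like in practice (the class and names here are hypothetical, not taken from the guide or the benchmarks):

```python
# Hypothetical command-handler class showing two restructurings:
# dict dispatch (replaces an if/elif chain over command names) and
# bulk attribute assignment (replaces one self.x = x line per field).
def _add(s, a): s.items.append(a)
def _rm(s, a): s.items.remove(a)
def _clear(s, a): s.items.clear()
DISPATCH = {"add": _add, "rm": _rm, "clear": _clear}

class Store:
    def __init__(s, **kw):
        s.items = []
        for k, v in {"path": None, "limit": 100, "strict": False, **kw}.items():
            setattr(s, k, v)
    def run(s, cmd, arg=None):
        DISPATCH[cmd](s, arg)  # a KeyError doubles as the unknown-command error
```

`Store(limit=10).run("add", "x")` exercises both patterns; the point is that structure, not formatting, is where most of the savings come from.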
The guide is not a tutorial. It's a specification for LLM agents. The skill
in plugins/raisin/skills/raisin/SKILL.md is the same methodology packaged
as a Claude Code / Codex skill.
```
raisin/
├── README.md                  - you are here
├── CODE_COMPRESSION_GUIDE.md  - the methodology (agent instructions)
├── RESULTS.md                 - detailed retrofit results
├── TECHNIQUES.md              - design document
├── READABILITY_ARGUMENT.md    - pre-emptive response to "but it's unreadable"
│
├── .claude-plugin/
│   └── marketplace.json       - Claude Code marketplace definition
│
├── plugins/raisin/            - the installable skill
│   ├── .claude-plugin/plugin.json
│   ├── README.md
│   └── skills/raisin/SKILL.md - methodology as agent instruction
│
├── assets/
│   ├── raisin-banner.png      - social preview
│   ├── raisin-logo.png        - logo with code braces
│   └── raisin-icon-simple.png - minimalist icon
│
├── greenfield/                - write-from-scratch experiment
│   ├── SPEC.md, RESULTS.md
│   ├── tests/test_todo.py     - 28 tests (shared)
│   ├── normal/todo.py         - human-imitating (437 LOC)
│   ├── normal_L1/todo.py      - after automated strip
│   ├── normal_L2/todo.py      - after cosmetic pass
│   └── kolmogorov/todo.py     - LLM-native (104 LOC)
│
├── guide_validation/          - methodology transfer test
│   ├── spec/SPEC.md, spec/test_url_short.py - 20 tests
│   ├── normal/url_short.py    - reference human-style
│   └── kolmogorov/url_short.py - agent wrote this using only the guide
│
├── original/                  - Click 8.2.1 source
├── L1_clean/                  - Click after automated strip
├── LK_kolmogorov/             - Click after LLM-native rewrite
├── LK2_aggressive/            - Click after second rewrite pass
├── LK3_agent_click/           - Click with agent's formatting.py
│
├── flask_benchmark/           - Flask 3.1.3 + compressed versions
├── bottle_benchmark/          - Bottle 0.13.4 + compressed versions
│
├── tests/                     - Click 8.2.1's own 738-test suite
└── tools/                     - measure/strip/run_tests/full_report
```
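tools/measure.py produces the token counts quoted throughout. A minimal sketch of the same measurement, assuming tiktoken's cl100k_base encoding (the repo's actual encoding and aggregation may differ):

```python
import sys
import tiktoken

# Count tokens and non-blank LOC per file; the encoding choice here
# is an assumption, not necessarily what tools/measure.py uses.
enc = tiktoken.get_encoding("cl100k_base")
for path in sys.argv[1:]:
    text = open(path, encoding="utf-8").read()
    tokens = len(enc.encode(text))
    loc = sum(1 for line in text.splitlines() if line.strip())
    print(f"{path}: {tokens} tokens, {loc} non-blank LOC")
```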
A 200K context window loaded with Click + Flask + Bottle (original) consumes 170,767 tokens (85% of the window). The LLM-native versions consume 79,542 tokens (40%), leaving ~120K tokens free for actual thinking.
Across a 50-library research mission, this saves ~1.5M tokens and roughly $4.50 per run at Claude's API rates.
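The arithmetic behind that estimate, assuming roughly $3 per million input tokens (a rate assumption; the README does not pin the exact price):

```python
# Back-of-envelope check: per-library savings from the three measured
# libraries, scaled to a 50-library mission.
original, dense, measured = 170_767, 79_542, 3
per_library = (original - dense) / measured      # ~30.4K tokens saved
mission = per_library * 50                       # ~1.52M tokens saved
print(f"{mission/1e6:.2f}M tokens, ${mission/1e6 * 3:.2f}")  # 1.52M tokens, $4.56
```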
When an LLM produces docstrings, type hints, and verbose error handling, it spends real wall-clock time generating tokens that nothing will ever read. Remove those constraints and the LLM writes the program faster and cleaner.
LLM-native code is not unreadable; it's densely readable. A Python programmer reading the 104-line TODO finishes faster than with the 437-line version because there is less to skip. LLMs read it trivially.
Atlas Coding Engine (ACE Protocol v15) is the production methodology this benchmark validates. Atlas, the agentic AI platform that produced this benchmark, comprises 47,472 LOC across 163 shard files, built LLM-native from day one with a claimed ~60% LOC savings, verified here.
- Repo analysis, methodology, tooling, skill: MIT
- Click, Flask, Bottle and derivative works: original licenses (BSD-3, BSD-3, MIT)
- See LICENSE for details
```bibtex
@misc{raisin-2026,
  author = {Oldrich333},
  title  = {raisin: Write Python LLMs can read},
  year   = {2026},
  url    = {https://github.com/Oldrich333/raisin}
}
```
