QuantEcon · mmcky · Jun 5, 2026 · May 25, 2026 · May 25, 2026
diff --git a/docs/developer/architecture.md b/docs/developer/architecture.md
@@ -17,7 +17,7 @@ Technical documentation for developers working on the QuantEcon Style Guide Chec
                            ▼
 ┌─────────────────────────────────────────────────────┐
 │                      action.yml                      │
-│  Sets up Python, installs deps, invokes action.py    │
+│  Installs uv, syncs deps, invokes action.py via uv   │
 └──────────────────────────┬──────────────────────────┘
                            │
               ┌────────────┴────────────┐
@@ -32,18 +32,16 @@ Technical documentation for developers working on the QuantEcon Style Guide Chec
                     ┌───────────────────┼──────────────┐
                     ▼                   ▼              ▼
           ┌──────────────┐   ┌──────────────┐  ┌─────────────┐
-          │prompt_loader  │   │ fix_applier   │  │Anthropic API│
-          │Load prompts   │   │Apply fixes    │  │(Claude)     │
-          │Load rules     │   │Validate       │  └─────────────┘
-          └──────┬───────┘   └──────────────┘
-          ┌──────┴───────┐
-          ▼              ▼
-    ┌──────────┐  ┌──────────┐
-    │prompts/  │  │ rules/   │
-    │(8 files) │  │(8 files) │
-    └──────────┘  └──────────┘
+          │ prompts/      │   │ fix_applier   │  │Anthropic API│
+          │ prompt.md     │   │Apply fixes    │  │(Claude)     │
+          │  + rules/*.md │   │Validate       │  └─────────────┘
+          └──────────────┘   └──────────────┘
 ```
 
+`categories.py` is the single source of truth for the 8 category names
+(`writing`, `math`, `code`, `jax`, `figures`, `references`, `links`,
+`admonitions`); every other module imports `VALID_CATEGORIES` from it.
+
 ## Two Entry Points, One Engine
 
 - **`action.py`**: GitHub Action entry point. Reads files via GitHub API, creates PRs with fixes.
@@ -87,15 +85,21 @@ Key classes:
 - `AnthropicProvider` — Claude API wrapper with extended thinking and streaming fallback
 - `StyleReviewer` — Main review orchestrator
 
-### Prompt Loader (`prompt_loader.py`)
+### Prompt Construction (`reviewer.create_single_rule_prompt`)
 
-Loads and combines category-specific prompts and rules:
+For each rule, the reviewer builds an LLM prompt as:
 
 ```
-[Category Prompt]  +  [Style Guide Rules]  +  [Lecture Content]  →  LLM
+[Shared base prompt (prompts/prompt.md)]
+  + [Single rule definition from rules/{category}-rules.md]
+  + [Lecture content]
+  → LLM
 ```
 
-The prompt is rule-agnostic — all 8 category prompts are identical. Scope and analysis context come from the rule definitions themselves. This prevents signal dilution from category-specific instructions.
+The base prompt is rule-agnostic — a single `prompts/prompt.md` file is
+shared across all 8 categories. Scope and analysis context come from the
+rule definitions themselves, which prevents signal dilution from
+category-specific instructions.
 
 ### Fix Applier (`fix_applier.py`)
 
@@ -228,16 +232,18 @@ Depends on lecture length and violations found.
 ```
 action-style-guide/
 ├── action.yml                 # GitHub Action definition
+├── pyproject.toml             # Package + dep manifest (uv-managed)
+├── uv.lock                    # Reproducible dep lockfile
 ├── style_checker/             # Main package
 │   ├── __init__.py            # Version (__version__)
+│   ├── categories.py          # Single source of truth for VALID_CATEGORIES
 │   ├── cli.py                 # Local CLI entry point (qestyle)
 │   ├── action.py              # GitHub Action entry point
 │   ├── reviewer.py            # LLM review engine (shared)
 │   ├── fix_applier.py         # Apply fixes to files (shared)
 │   ├── github_handler.py      # GitHub API (action only)
-│   ├── prompt_loader.py       # Load prompts + rules (shared)
-│   ├── prompts/               # Minimal rule-agnostic prompts
-│   └── rules/                 # Category-specific rule definitions
+│   ├── prompts/               # Single shared prompt.md (+ v0.6.1 archive)
+│   └── rules/                 # Per-category rule definitions
 ├── tests/                     # Test suite
 ├── docs/                      # Documentation (this site)
 └── examples/                  # Example workflows

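Review note: the prompt layout documented in `architecture.md` (shared base prompt, then one rule definition, then the lecture content) can be sketched roughly as follows. This is a minimal illustration, not the body of `reviewer.create_single_rule_prompt`: the function names `build_single_rule_prompt` and `rule_file_for`, and the `## Rule` / `## Lecture` section headings, are hypothetical and do not appear in the PR.

```python
from pathlib import Path


def rule_file_for(category: str, rules_dir: Path) -> Path:
    """Map a category name to rules/{category}-rules.md (naming taken from the docs)."""
    return rules_dir / f"{category}-rules.md"


def build_single_rule_prompt(base_prompt: str, rule_text: str, lecture: str) -> str:
    """Join the three parts in the documented order: base prompt, rule, lecture.

    The section headings are an assumption made for readability; the real
    prompt may separate the parts differently.
    """
    sections = [
        base_prompt.strip(),
        "## Rule\n\n" + rule_text.strip(),
        "## Lecture\n\n" + lecture.strip(),
    ]
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    prompt = build_single_rule_prompt(
        base_prompt="You are reviewing a lecture against one style rule.",
        rule_text="Use one sentence per paragraph.",
        lecture="# Intro\n\nSome lecture text.",
    )
    print(prompt)
```

Because the base prompt carries no category-specific text, the only per-call variation in this sketch is the rule definition and the lecture, which matches the "rule-agnostic" claim in the doc.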
diff --git a/docs/developer/contributing.md b/docs/developer/contributing.md
@@ -88,16 +88,15 @@ Rules are in `style_checker/rules/` and are read directly by the LLM — **no co
    [Good and bad examples]
    ```
 
-3. Update corresponding prompt file in `style_checker/prompts/` if needed
+3. The base prompt (`prompts/prompt.md`) is shared across all categories, so it usually needs no edits.
 4. Test with real lecture files
 
 ### Adding a New Category
 
-1. Create `prompts/category-prompt.md`
-2. Create `rules/category-rules.md`
-3. Add category to `VALID_CATEGORIES` in `github_handler.py` and `prompt_loader.py`
-4. Add to category list in `review_lecture_smart()`
-5. Test end-to-end
+1. Create `style_checker/rules/{category}-rules.md`
+2. Add the new name to `VALID_CATEGORIES` in `style_checker/categories.py`
+3. Add an entry for it in `RULE_EVALUATION_ORDER` in `style_checker/reviewer.py` (a test in `tests/test_reviewer.py` fails if its keys drift from `VALID_CATEGORIES`)
+4. Test end-to-end
 
 ## Pull Request Process
 

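Review note: the drift check that step 3 relies on might look something like the sketch below. It assumes `RULE_EVALUATION_ORDER` is a dict keyed by category name, as the `RULE_EVALUATION_ORDER.keys()` wording in `categories.py` suggests. The dict values here are empty placeholders, and `category_drift` is a hypothetical helper, not a function from the repository.

```python
VALID_CATEGORIES = (
    "writing", "math", "code", "jax",
    "figures", "references", "links", "admonitions",
)

# Placeholder stand-in: the real mapping lives in style_checker/reviewer.py
# and its values are not shown in this PR.
RULE_EVALUATION_ORDER = {name: [] for name in VALID_CATEGORIES}


def category_drift(valid, evaluation_order):
    """Return (missing, extra): names only in `valid`, and names only in the mapping."""
    valid_set = set(valid)
    order_set = set(evaluation_order)
    return sorted(valid_set - order_set), sorted(order_set - valid_set)


def test_categories_match_evaluation_order():
    missing, extra = category_drift(VALID_CATEGORIES, RULE_EVALUATION_ORDER)
    assert not missing, f"categories without an evaluation order: {missing}"
    assert not extra, f"evaluation order keys that are not categories: {extra}"
```

Reporting both directions separately makes the failure message say which file needs the edit: a `missing` entry points at `reviewer.py`, an `extra` entry at `categories.py`.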
diff --git a/docs/developer/extended-thinking.md b/docs/developer/extended-thinking.md
@@ -117,5 +117,5 @@ This is 40 lines vs the previous 120-line category-specific prompts, and it prod
 |----------|-----------|
 | `thinking_budget=10000` | Enough for careful analysis, not excessive cost |
 | `temperature=1.0` | Required by Anthropic for extended thinking |
-| 8 identical prompt files (for now) | Consolidation to single file planned (validated on writing, pending other categories) |
+| Single shared `prompts/prompt.md` | Consolidated from 8 byte-identical files (validated on writing, then rolled out across all categories) |
 | Archive v0.6.1 prompts | Reference for regression testing and comparison |
diff --git a/docs/developer/roadmap.md b/docs/developer/roadmap.md
@@ -28,7 +28,7 @@ The project is in **active development**. Breaking changes are acceptable — th
 ### Phase 3: Test Suite Improvements ✅
 
 - Fixed `test_parsing.py` to test real methods
-- Added tests for `fix_applier.py`, `prompt_loader.py`, `reviewer.py`
+- Added tests for `fix_applier.py`, `reviewer.py` (incl. `RULE_EVALUATION_ORDER` drift detection and prompt file existence)
 - Set up CI pipeline (GitHub Actions, ruff linting, Python 3.11/3.12/3.13)
 
 ## In Progress
@@ -44,7 +44,7 @@ Focus: reduce LLM hallucinations, improve fix accuracy, move mechanical rules to
 | 4.3 Deterministic Checkers | ~13 mechanical rules via regex (zero hallucination risk) | Planned |
 | 4.4 Rule Clarity | Improve 12 rule descriptions to reduce misinterpretation | Planned |
 | 4.5 Scope Reduction | Reduce noise from overly subjective rules | Planned |
-| 4.6 Prompt Consolidation | Merge 8 identical prompt files into single `prompt.md` | Planned |
+| 4.6 Prompt Consolidation | Merge 8 identical prompt files into single `prompt.md` | **Done** (PR #17) |
 | 4.7 Extended Thinking | Claude reasons internally → 0% false positives | **Done** (v0.7.0) |
 
 ### Phase 5: Style Suggestion UX

diff --git a/docs/developer/testing.md b/docs/developer/testing.md
@@ -27,8 +27,7 @@ tests/
 ├── test_github_handler.py    # GitHub API interaction, comment parsing
 ├── test_markdown_parser.py   # LLM response parsing
 ├── test_parsing.py           # Comment trigger pattern matching
-├── test_prompt_loader.py     # Prompt/rules file loading
-├── test_reviewer.py          # Rule extraction and evaluation order
+├── test_reviewer.py          # Rule extraction, RULE_EVALUATION_ORDER, prompt file
 ├── test_llm_integration.py   # Real LLM API calls (@integration)
 └── test_cli.py               # CLI argument parsing
 ```
@@ -43,8 +42,7 @@ Run automatically with `pytest`:
 | `test_github_handler.py` | GitHub API interaction, comment parsing |
 | `test_markdown_parser.py` | LLM response parsing |
 | `test_parsing.py` | Comment trigger pattern matching (real method) |
-| `test_prompt_loader.py` | Prompt/rules file loading |
-| `test_reviewer.py` | Rule extraction and evaluation order |
+| `test_reviewer.py` | Rule extraction, `RULE_EVALUATION_ORDER` consistency, prompt file existence |
 | `test_cli.py` | CLI argument parsing |
 
 ### Integration Tests (Slow, Costs Money)
@@ -69,15 +67,14 @@ pytest --cov=style_checker --cov-report=html
 open htmlcov/index.html
 ```
 
-Current coverage:
+Current coverage (approximate — re-measure with `pytest --cov`):
 
 | File | Coverage |
 |------|----------|
-| `fix_applier.py` | 92% |
-| `prompt_loader.py` | 86% |
-| `github_handler.py` | 55% |
-| `reviewer.py` | 47% |
-| `action.py` | 0% (needs integration mocking) |
+| `fix_applier.py` | high |
+| `github_handler.py` | medium |
+| `reviewer.py` | medium |
+| `action.py` | 0% (needs integration mocking — tracked in TECHNICAL-REVIEW §6.1) |
 
 ## CI Pipeline
 

diff --git a/style_checker/categories.py b/style_checker/categories.py
@@ -0,0 +1,21 @@
+"""
+Single source of truth for the style-rule category names.
+
+Every category here must have a corresponding `{name}-rules.md` file in
+`style_checker/rules/` and a matching entry in
+`reviewer.RULE_EVALUATION_ORDER`. The consistency between this tuple and
+`RULE_EVALUATION_ORDER.keys()` is enforced by a test in `tests/test_reviewer.py`.
+"""
+
+# Ordered: a category's position in this tuple is its default processing
+# order in `StyleReviewer.review_lecture_smart`.
+VALID_CATEGORIES = (
+    "writing",
+    "math",
+    "code",
+    "jax",
+    "figures",
+    "references",
+    "links",
+    "admonitions",
+)
diff --git a/style_checker/cli.py b/style_checker/cli.py
@@ -22,16 +22,10 @@
 from datetime import datetime
 
 from style_checker import __version__
+from style_checker.categories import VALID_CATEGORIES
 from style_checker.reviewer import StyleReviewer
 
 
-# All available categories (matches reviewer.RULE_EVALUATION_ORDER keys)
-ALL_CATEGORIES = [
-    "writing", "math", "code", "jax",
-    "figures", "references", "links", "admonitions",
-]
-
-
 def display_width(s: str) -> int:
     """Calculate terminal display width, accounting for wide/emoji characters."""
     w = 0
@@ -278,13 +272,13 @@ def main():
     # Parse categories
     if args.categories:
         categories = [c.strip() for c in args.categories.split(",")]
-        invalid = [c for c in categories if c not in ALL_CATEGORIES]
+        invalid = [c for c in categories if c not in VALID_CATEGORIES]
         if invalid:
             print(f"Error: invalid categories: {', '.join(invalid)}", file=sys.stderr)
-            print(f"Valid categories: {', '.join(ALL_CATEGORIES)}", file=sys.stderr)
+            print(f"Valid categories: {', '.join(VALID_CATEGORIES)}", file=sys.stderr)
             sys.exit(1)
     else:
-        categories = list(ALL_CATEGORIES)
+        categories = list(VALID_CATEGORIES)
 
     # API key
     api_key = args.api_key or os.environ.get("ANTHROPIC_API_KEY")

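Review note: the category handling in `main()` could be pulled into a small pure function so it can be tested without invoking the CLI. The sketch below mirrors the logic in the hunk above under that assumption. `parse_categories` is a hypothetical name and is not part of this PR.

```python
VALID_CATEGORIES = (
    "writing", "math", "code", "jax",
    "figures", "references", "links", "admonitions",
)


def parse_categories(raw, valid=VALID_CATEGORIES):
    """Split a comma-separated category string and report unknown names.

    Returns (categories, invalid). When `raw` is empty or None, every valid
    category is selected, in the order given by the tuple.
    """
    if not raw:
        # list(...) copies the tuple so callers can safely mutate the result.
        return list(valid), []
    requested = [c.strip() for c in raw.split(",")]
    invalid = [c for c in requested if c not in valid]
    return requested, invalid


if __name__ == "__main__":
    categories, invalid = parse_categories("math, typo")
    if invalid:
        print(f"Error: invalid categories: {', '.join(invalid)}")
        print(f"Valid categories: {', '.join(VALID_CATEGORIES)}")
```

Since `VALID_CATEGORIES` is now a tuple rather than a list or set, the "Valid categories" message prints names in the default processing order instead of an arbitrary one.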
diff --git a/style_checker/github_handler.py b/style_checker/github_handler.py
@@ -11,16 +11,12 @@
 from github import Github, GithubException
 from datetime import datetime
 
+from .categories import VALID_CATEGORIES
+
 
 class GitHubHandler:
     """Handles GitHub API interactions for PR and issue management"""
 
-    # Valid category names (must match files in style_checker/rules/)
-    VALID_CATEGORIES = {
-        'writing', 'math', 'code', 'jax',
-        'figures', 'references', 'links', 'admonitions'
-    }
-
     def __init__(self, token: str, repository: str):
         """
         Initialize GitHub handler
@@ -79,10 +75,10 @@ def extract_lecture_from_comment(self, comment_body: str) -> Optional[Tuple[str,
 
                     # Validate categories
                     if categories != ['all']:
-                        invalid = [c for c in categories if c not in self.VALID_CATEGORIES]
+                        invalid = [c for c in categories if c not in VALID_CATEGORIES]
                         if invalid:
                             print(f"⚠️  Invalid categories: {', '.join(invalid)}")
-                            print(f"   Valid categories: {', '.join(sorted(self.VALID_CATEGORIES))}")
+                            print(f"   Valid categories: {', '.join(sorted(VALID_CATEGORIES))}")
                             return None
                 else:
                     categories = ['all']  # Default to all categories