Consolidate VALID_CATEGORIES; delete dead prompt_loader (TECHNICAL-REVIEW §4.1, §4.2)#17
Conversation
Addresses §4.1 + §4.2 of TECHNICAL-REVIEW.md. Before: the 8-category list was defined in four places that could drift silently — `cli.ALL_CATEGORIES` (list), `GitHubHandler.VALID_CATEGORIES` (set), `PromptLoader.VALID_CATEGORIES` (list), and the keys of `reviewer.RULE_EVALUATION_ORDER` (dict). Plus a duplicate hardcoded list inside `review_lecture_smart`. Test coverage only verified that two of those four matched. After: one tuple in the new `style_checker/categories.py` module that every other call site imports. RULE_EVALUATION_ORDER still owns the per-category rule ordering, but a new test asserts its keys are exactly `VALID_CATEGORIES` in the same order — drift fails CI loudly. prompt_loader.py was dead production code: only tests imported the PromptLoader class. The runtime (`reviewer.create_single_rule_prompt`) loads prompts directly. Deleting the module + its test removes the fourth source-of-truth for the category list and ~230 lines of unmaintained code, with no behavior change. Files touched: + style_checker/categories.py (new, 22 lines — the single source) - style_checker/prompt_loader.py (deleted, was ~230 lines, unused at runtime) - tests/test_prompt_loader.py (deleted, tested the deleted class) M style_checker/cli.py (import VALID_CATEGORIES; drop ALL_CATEGORIES alias) M style_checker/github_handler.py (drop class-attr; import VALID_CATEGORIES) M style_checker/reviewer.py (drop hardcoded list in review_lecture_smart) M tests/test_cli.py (import VALID_CATEGORIES from canonical location) M tests/test_parsing.py (drop fixture's class-attr setup) M tests/test_reviewer.py (import from categories; add drift-detection test) Test suite: 85 passed (was 91; net -6 from prompt_loader test deletion + 1 new RULE_EVALUATION_ORDER consistency test).
There was a problem hiding this comment.
Pull request overview
This PR removes unused prompt-loading infrastructure and centralizes the style-checker category list into a single canonical location, reducing the chance of category “drift” across the CLI, GitHub Action, and reviewer logic.
Changes:
- Added
style_checker/categories.pyto defineVALID_CATEGORIESas the single source of truth. - Deleted dead
PromptLoadercode and its dedicated test module. - Updated CLI, GitHub handler, reviewer, and tests to import and use
VALID_CATEGORIES, plus added a drift-detection test to keepRULE_EVALUATION_ORDERaligned.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| style_checker/categories.py | Introduces canonical VALID_CATEGORIES tuple and documents intended invariants. |
| style_checker/prompt_loader.py | Deletes unused prompt loader implementation. |
| style_checker/cli.py | Switches category validation/defaulting to VALID_CATEGORIES. |
| style_checker/github_handler.py | Removes class-level category set; validates against imported VALID_CATEGORIES. |
| style_checker/reviewer.py | Replaces hardcoded default category list with VALID_CATEGORIES. |
| tests/test_cli.py | Updates tests to reference VALID_CATEGORIES instead of ALL_CATEGORIES. |
| tests/test_parsing.py | Simplifies fixture setup now that category validation is module-level. |
| tests/test_reviewer.py | Replaces PromptLoader.VALID_CATEGORIES usage and adds drift-detection test for RULE_EVALUATION_ORDER keys/order. |
| tests/test_prompt_loader.py | Deletes tests tied to removed PromptLoader. |
…(§1.4) Addresses Copilot review feedback on PR #17 by removing the root cause rather than documenting it: the prompt-file-per-category invariant. Before: 8 byte-identical files (admonitions-prompt.md, code-prompt.md, ...) sat in style_checker/prompts/. create_single_rule_prompt picked one based on the category arg, even though the content was the same for every category. The TECHNICAL-REVIEW §1.4 and roadmap §4.6 both flagged this — Copilot's two comments on PR #17 surfaced the cost (docstring drift risk + lost test coverage when prompt_loader was deleted). After: a single style_checker/prompts/prompt.md is loaded once per rule invocation, identical for every category. The `category` arg on create_single_rule_prompt is now unused but kept in the signature so callers don't need to change if a category-specific prompt is ever reintroduced. Replacement test coverage in test_reviewer.py: - test_prompt_file_exists — catches packaging regressions where the prompt file isn't shipped. - test_prompt_file_has_version_header — keeps the version-tracking invariant that test_prompt_loader used to enforce. Docs updated: - architecture.md: diagram, "Prompt Construction" section, repo tree. - extended-thinking.md: "8 identical (for now)" → "single shared". - testing.md: drop test_prompt_loader.py reference; refresh coverage. - roadmap.md: mark §4.6 (prompt consolidation) as Done (PR #17). - contributing.md: update "Adding a New Category" steps (no more category-prompt.md to create; mention categories.py and the test that catches RULE_EVALUATION_ORDER drift). - tests/README.md: same content updates + uv install command. Net: -7 files (deleted 8 .md, added 1 .md), reviewer.py simpler, docs aligned with current architecture.
|
Expanded scope to address Copilot's feedback at the root cause. New commit Why expand scopeBoth Copilot comments pointed at the prompt-file-per-category invariant:
Rather than document/test an invariant that the roadmap already flagged as work-to-remove, this commit removes the invariant entirely — one shared What's in
|
| Change | Notes |
|---|---|
style_checker/prompts/prompt.md |
New canonical single shared prompt |
style_checker/prompts/{8 categories}-prompt.md |
Deleted (were byte-identical) |
style_checker/reviewer.py |
create_single_rule_prompt now loads the single file |
tests/test_reviewer.py |
New TestPromptFile class with test_prompt_file_exists + test_prompt_file_has_version_header |
| 6 doc updates | architecture, contributing, testing, roadmap, extended-thinking, tests/README — all stale references to the 8-file structure and to prompt_loader.py are removed |
Test status
- 87 tests pass (was 85; +2 new prompt-file tests)
- Verified
create_single_rule_prompt('writing', ...) == create_single_rule_prompt('math', ...)— consolidation produces identical prompts as intended - CI green on the previous commit; new commit waiting for CI
Diff stats
This commit: +87 / −346 across 16 files (mostly deletions of duplicate prompt content + doc cleanup).
Full PR (both commits): +153 / −706 across 17 files (~5× more code removed than added).
The 'Prompt Files' prose still described 8 identical per-category files and listed the consolidation as 'planned'. #17 completed it: update the location to the single prompt.md and reword the design/version notes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
) (#22) * docs: fix stale repo structure in copilot-instructions Follow-up to #17. The repository-structure diagram still listed the deleted prompt_loader.py and the per-category {name}-prompt.md files. Replace with the consolidated layout: categories.py (single source of truth for VALID_CATEGORIES) and the single shared prompts/prompt.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs: update Prompt Files section for consolidated prompt The 'Prompt Files' prose still described 8 identical per-category files and listed the consolidation as 'planned'. #17 completed it: update the location to the single prompt.md and reword the design/version notes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Addresses §4.1 and §4.2 of TECHNICAL-REVIEW.md:
style_checker/prompt_loader.py. It was dead production code: only the test suite imported it. The runtime path (reviewer.create_single_rule_prompt) loads prompts directly from disk.style_checker/categories.py.What changed
Before: the category list was defined in four places that could drift silently —
cli.ALL_CATEGORIES(list)GitHubHandler.VALID_CATEGORIES(set, class attribute)PromptLoader.VALID_CATEGORIES(list, on the dead class)reviewer.RULE_EVALUATION_ORDERkeys (dict)review_lecture_smartTest coverage only verified that two of those four agreed.
After: one tuple, in one file:
RULE_EVALUATION_ORDERstill owns the per-category rule ordering, but a new test (test_rule_evaluation_order_keys_match_valid_categories) asserts its keys are exactlyVALID_CATEGORIESin the same order — drift fails CI loudly on the next edit.Files
style_checker/categories.pystyle_checker/prompt_loader.pytests/test_prompt_loader.pystyle_checker/cli.pyVALID_CATEGORIES; dropALL_CATEGORIESaliasstyle_checker/github_handler.pystyle_checker/reviewer.pyreview_lecture_smarttests/test_cli.pytests/test_parsing.pytests/test_reviewer.pyNet diff: +66 / −360 across 9 files.
Test plan
uv run pytest tests/→ 85 passedVALID_CATEGORIESresolves to the canonical tupletest_rule_evaluation_order_keys_match_valid_categoriescatches drift between the two definitionstest-action-style-guideissue TASK: Evaluate individual style-guide rules to improve #1 (post-merge)Why now
Surfaced during the deep technical review (PR #15 was the first batch — P0/P1 correctness bugs). This is the next-priority architectural cleanup item: low-risk, mostly mechanical, and removes 4× the surface area for the "I forgot to update one of the lists" class of bug.
🤖 Generated with Claude Code