Skip to content

Consolidate VALID_CATEGORIES; delete dead prompt_loader (TECHNICAL-REVIEW §4.1, §4.2)#17

Merged
mmcky merged 2 commits into
mainfrom
refactor/consolidate-categories
Jun 5, 2026
Merged

Consolidate VALID_CATEGORIES; delete dead prompt_loader (TECHNICAL-REVIEW §4.1, §4.2)#17
mmcky merged 2 commits into
mainfrom
refactor/consolidate-categories

Conversation

@mmcky

@mmcky mmcky commented May 25, 2026

Copy link
Copy Markdown
Contributor

Summary

Addresses §4.1 and §4.2 of TECHNICAL-REVIEW.md:

  • §4.1 — Delete style_checker/prompt_loader.py. It was dead production code: only the test suite imported it. The runtime path (reviewer.create_single_rule_prompt) loads prompts directly from disk.
  • §4.2 — Consolidate the four (!) sources of truth for the 8-category list into one tuple in style_checker/categories.py.

What changed

Before: the category list was defined in four places that could drift silently —

  • cli.ALL_CATEGORIES (list)
  • GitHubHandler.VALID_CATEGORIES (set, class attribute)
  • PromptLoader.VALID_CATEGORIES (list, on the dead class)
  • reviewer.RULE_EVALUATION_ORDER keys (dict)
  • plus a duplicate hardcoded list inside review_lecture_smart

Test coverage only verified that two of those four agreed.

After: one tuple, in one file:

# style_checker/categories.py
VALID_CATEGORIES = (
    "writing", "math", "code", "jax",
    "figures", "references", "links", "admonitions",
)

RULE_EVALUATION_ORDER still owns the per-category rule ordering, but a new test (test_rule_evaluation_order_keys_match_valid_categories) asserts its keys are exactly VALID_CATEGORIES in the same order — drift fails CI loudly on the next edit.

Files

Change File Notes
NEW style_checker/categories.py 22 lines — the single source
DELETE style_checker/prompt_loader.py ~230 lines, unused at runtime
DELETE tests/test_prompt_loader.py Tested the deleted class
modified style_checker/cli.py Import VALID_CATEGORIES; drop ALL_CATEGORIES alias
modified style_checker/github_handler.py Drop class-attr; module-level import
modified style_checker/reviewer.py Drop hardcoded list in review_lecture_smart
modified tests/test_cli.py Import from canonical location
modified tests/test_parsing.py Drop fixture's class-attr setup
modified tests/test_reviewer.py Import from canonical + add drift-detection test

Net diff: +66 / −360 across 9 files.

Test plan

  • Unit tests pass: uv run pytest tests/85 passed
  • All modules import cleanly
  • Every call site of VALID_CATEGORIES resolves to the canonical tuple
  • New test_rule_evaluation_order_keys_match_valid_categories catches drift between the two definitions
  • Smoke test via test-action-style-guide issue TASK: Evaluate individual style-guide rules to improve #1 (post-merge)

Why now

Surfaced during the deep technical review (PR #15 was the first batch — P0/P1 correctness bugs). This is the next-priority architectural cleanup item: low-risk, mostly mechanical, and removes 4× the surface area for the "I forgot to update one of the lists" class of bug.

🤖 Generated with Claude Code

Addresses §4.1 + §4.2 of TECHNICAL-REVIEW.md.

Before: the 8-category list was defined in four places that could drift
silently — `cli.ALL_CATEGORIES` (list), `GitHubHandler.VALID_CATEGORIES`
(set), `PromptLoader.VALID_CATEGORIES` (list), and the keys of
`reviewer.RULE_EVALUATION_ORDER` (dict). Plus a duplicate hardcoded list
inside `review_lecture_smart`. Test coverage only verified that two of
those four matched.

After: one tuple in the new `style_checker/categories.py` module that
every other call site imports. RULE_EVALUATION_ORDER still owns the
per-category rule ordering, but a new test asserts its keys are exactly
`VALID_CATEGORIES` in the same order — drift fails CI loudly.

prompt_loader.py was dead production code: only tests imported the
PromptLoader class. The runtime (`reviewer.create_single_rule_prompt`)
loads prompts directly. Deleting the module + its test removes the
fourth source-of-truth for the category list and ~230 lines of
unmaintained code, with no behavior change.

Files touched:
  + style_checker/categories.py        (new, 22 lines — the single source)
  - style_checker/prompt_loader.py     (deleted, was ~230 lines, unused at runtime)
  - tests/test_prompt_loader.py        (deleted, tested the deleted class)
  M style_checker/cli.py               (import VALID_CATEGORIES; drop ALL_CATEGORIES alias)
  M style_checker/github_handler.py    (drop class-attr; import VALID_CATEGORIES)
  M style_checker/reviewer.py          (drop hardcoded list in review_lecture_smart)
  M tests/test_cli.py                  (import VALID_CATEGORIES from canonical location)
  M tests/test_parsing.py              (drop fixture's class-attr setup)
  M tests/test_reviewer.py             (import from categories; add drift-detection test)

Test suite: 85 passed (was 91; net -6 from prompt_loader test deletion
+ 1 new RULE_EVALUATION_ORDER consistency test).
Copilot AI review requested due to automatic review settings May 25, 2026 05:46

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unused prompt-loading infrastructure and centralizes the style-checker category list into a single canonical location, reducing the chance of category “drift” across the CLI, GitHub Action, and reviewer logic.

Changes:

  • Added style_checker/categories.py to define VALID_CATEGORIES as the single source of truth.
  • Deleted dead PromptLoader code and its dedicated test module.
  • Updated CLI, GitHub handler, reviewer, and tests to import and use VALID_CATEGORIES, plus added a drift-detection test to keep RULE_EVALUATION_ORDER aligned.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
style_checker/categories.py Introduces canonical VALID_CATEGORIES tuple and documents intended invariants.
style_checker/prompt_loader.py Deletes unused prompt loader implementation.
style_checker/cli.py Switches category validation/defaulting to VALID_CATEGORIES.
style_checker/github_handler.py Removes class-level category set; validates against imported VALID_CATEGORIES.
style_checker/reviewer.py Replaces hardcoded default category list with VALID_CATEGORIES.
tests/test_cli.py Updates tests to reference VALID_CATEGORIES instead of ALL_CATEGORIES.
tests/test_parsing.py Simplifies fixture setup now that category validation is module-level.
tests/test_reviewer.py Replaces PromptLoader.VALID_CATEGORIES usage and adds drift-detection test for RULE_EVALUATION_ORDER keys/order.
tests/test_prompt_loader.py Deletes tests tied to removed PromptLoader.

Comment thread style_checker/categories.py
Comment thread tests/test_reviewer.py
…(§1.4)

Addresses Copilot review feedback on PR #17 by removing the root cause
rather than documenting it: the prompt-file-per-category invariant.

Before: 8 byte-identical files (admonitions-prompt.md, code-prompt.md,
...) sat in style_checker/prompts/. create_single_rule_prompt picked
one based on the category arg, even though the content was the same
for every category. The TECHNICAL-REVIEW §1.4 and roadmap §4.6 both
flagged this — Copilot's two comments on PR #17 surfaced the cost
(docstring drift risk + lost test coverage when prompt_loader was
deleted).

After: a single style_checker/prompts/prompt.md is loaded once per
rule invocation, identical for every category. The `category` arg on
create_single_rule_prompt is now unused but kept in the signature so
callers don't need to change if a category-specific prompt is ever
reintroduced.

Replacement test coverage in test_reviewer.py:
- test_prompt_file_exists — catches packaging regressions where the
  prompt file isn't shipped.
- test_prompt_file_has_version_header — keeps the version-tracking
  invariant that test_prompt_loader used to enforce.

Docs updated:
- architecture.md: diagram, "Prompt Construction" section, repo tree.
- extended-thinking.md: "8 identical (for now)" → "single shared".
- testing.md: drop test_prompt_loader.py reference; refresh coverage.
- roadmap.md: mark §4.6 (prompt consolidation) as Done (PR #17).
- contributing.md: update "Adding a New Category" steps (no more
  category-prompt.md to create; mention categories.py and the test
  that catches RULE_EVALUATION_ORDER drift).
- tests/README.md: same content updates + uv install command.

Net: -7 files (deleted 8 .md, added 1 .md), reviewer.py simpler,
docs aligned with current architecture.
@mmcky

mmcky commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Expanded scope to address Copilot's feedback at the root cause. New commit 01c50cf pulls in TECHNICAL-REVIEW §1.4 (roadmap §4.6) — consolidating the 8 byte-identical prompt files into a single shared prompts/prompt.md.

Why expand scope

Both Copilot comments pointed at the prompt-file-per-category invariant:

  • Comment 1: "docstring should mention {name}-prompt.md is required"
  • Comment 2: "we lost the test that verified each {name}-prompt.md exists"

Rather than document/test an invariant that the roadmap already flagged as work-to-remove, this commit removes the invariant entirely — one shared prompt.md, no per-category prompt files at all. The category arg on create_single_rule_prompt is kept (unused) so future callers don't need to change if a category-specific prompt is ever reintroduced.

What's in 01c50cf

Change Notes
style_checker/prompts/prompt.md New canonical single shared prompt
style_checker/prompts/{8 categories}-prompt.md Deleted (were byte-identical)
style_checker/reviewer.py create_single_rule_prompt now loads the single file
tests/test_reviewer.py New TestPromptFile class with test_prompt_file_exists + test_prompt_file_has_version_header
6 doc updates architecture, contributing, testing, roadmap, extended-thinking, tests/README — all stale references to the 8-file structure and to prompt_loader.py are removed

Test status

  • 87 tests pass (was 85; +2 new prompt-file tests)
  • Verified create_single_rule_prompt('writing', ...) == create_single_rule_prompt('math', ...) — consolidation produces identical prompts as intended
  • CI green on the previous commit; new commit waiting for CI

Diff stats

This commit: +87 / −346 across 16 files (mostly deletions of duplicate prompt content + doc cleanup).
Full PR (both commits): +153 / −706 across 17 files (~5× more code removed than added).

@mmcky mmcky merged commit f5fded1 into main Jun 5, 2026
4 checks passed
@mmcky mmcky deleted the refactor/consolidate-categories branch June 5, 2026 04:36
mmcky added a commit that referenced this pull request Jun 5, 2026
The 'Prompt Files' prose still described 8 identical per-category files
and listed the consolidation as 'planned'. #17 completed it: update the
location to the single prompt.md and reword the design/version notes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mmcky added a commit that referenced this pull request Jun 5, 2026
) (#22)

* docs: fix stale repo structure in copilot-instructions

Follow-up to #17. The repository-structure diagram still listed the
deleted prompt_loader.py and the per-category {name}-prompt.md files.
Replace with the consolidated layout: categories.py (single source of
truth for VALID_CATEGORIES) and the single shared prompts/prompt.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs: update Prompt Files section for consolidated prompt

The 'Prompt Files' prose still described 8 identical per-category files
and listed the consolidation as 'planned'. #17 completed it: update the
location to the single prompt.md and reword the design/version notes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants