phase-gate

A Claude Code plugin. You write a phased plan, the implementer runs it, and a row-based verification file decides whether each phase actually closed.

/plan         Write a phased plan with row-by-row success criteria
/implement    Execute one phase at a time against an explicit file manifest
/validate     Re-run every verification row; the gate flips when all pass
/debug-sweep  Adversarial agent passes flag bugs. Flag-only, no auto-fix.

The artifact that holds the loop together is verification.jsonl. Each success criterion in the plan becomes one row. A phase cannot close until its rows are green. The same file is reused at the end as the validation report.
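As a rough sketch of that gate (not the plugin's own code), a phase check could be written with `jq` along these lines. It assumes each row carries a numeric `phase` and a boolean `passes` field; the function name `phase_is_green` and the sample rows are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. Field names "phase" and "passes" follow the README's row
# description; numeric phase values and the sample rows are assumptions.

# phase_is_green FILE PHASE
# Exit 0 only when the phase has at least one row and every row passes.
phase_is_green() {
  local file="$1" phase="$2"
  jq -e -s --argjson phase "$phase" '
    map(select(.phase == $phase))
    | length > 0 and all(.[]; .passes == true)
  ' "$file" > /dev/null
}

# Hypothetical verification.jsonl with two phases.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
{"id":"p1-r1","phase":1,"criterion":"limiter module exists","check_type":"artifact","check_command":"true","passes":true}
{"id":"p1-r2","phase":1,"criterion":"unit tests pass","check_type":"test","check_command":"true","passes":true}
{"id":"p2-r1","phase":2,"criterion":"429 returned over limit","check_type":"behavioral","check_command":"false","passes":false}
EOF

phase_is_green "$demo" 1 && echo "phase 1 can close"
phase_is_green "$demo" 2 || echo "phase 2 stays open"
```

Treating a phase with no rows as not green is a design choice in this sketch: an empty gate should not pass by accident.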

Install

git clone https://github.com/keez97/phase-gate ~/.claude/plugins/phase-gate

Claude Code auto-discovers plugins under ~/.claude/plugins/. The four slash commands and three agents become available in any project on next session start.

To check that the verification spine is healthy on your machine (append, run, flip, and severity classification all working end to end), run the smoke test:

bash ~/.claude/plugins/phase-gate/scripts/smoke-test.sh

Ten assertions, no external dependencies beyond bash, git, and jq.

To work on phase-gate locally and try it against a real project:

ln -s ~/code/phase-gate ~/.claude/plugins/phase-gate-dev

What ships

| Path | What it is |
| --- | --- |
| `commands/plan.md` | `/plan`. Phased plan plus verification.jsonl. |
| `commands/implement.md` | `/implement`. Phase-by-phase execution with inline verification. |
| `commands/validate.md` | `/validate`. Full-plan verification rerun and report. |
| `commands/debug-sweep.md` | `/debug-sweep`. N adversarial sweeps over the post-implement diff. |
| `agents/implementer.md` | Bounded coding agent. Reads the bug catalog before editing. Refuses off-manifest edits. |
| `agents/debugger.md` | Adversarial sweep agent. Parametrized by lens. Never modifies code. |
| `agents/verifier.md` | Rubric-based gate evaluator. Reads one deliverable plus one rubric. |
| `scripts/verification-*.sh` | Append, run, and flip rows. This is the verification spine. |
| `scripts/debug-sweep.sh` | Sweep orchestrator. 30-min watchdog. Per-plan lock. |
| `scripts/debug-severity.sh` | Diff classifier. Picks 0, 2, 3, or 4 sweeps from what changed. |
| `scripts/select-catalog-tags.sh` | Ranks catalog entries by manifest relevance for prompt injection. |
| `scripts/mine-sweep-patterns.sh` | Counts recurring bug tags across plans. |
| `reference/verification-schema.md` | Row contract: id, rubric, check-type, evidence requirements. |
| `reference/bug-catalog.md` | Seven seeded universal patterns. Add more as your project hits them. |

Quick start

In a project where you've installed phase-gate:

/plan add a rate limiter to the public API

This produces <plan-dir>/plan.md (3+ phases, success criteria as bullets) and <plan-dir>/verification.jsonl (one row per criterion). Edit either before continuing.

/implement <plan-dir>/plan.md

The implementer runs each phase against an explicit file manifest. After the phase, it appends evidence quotes to the matching verification rows and flips them. If a row stays red, the phase doesn't close. Fix it or re-plan.

/debug-sweep <plan-dir>/plan.md

Sweep severity is auto-classified from the post-implement diff. Findings land in <plan-dir>/sweeps/ as JSONL plus a markdown summary. The agent never modifies code.

/validate <plan-dir>/plan.md

Re-runs every row against the current state and produces a validation report. Pass or fail.

Why row-by-row verification

Plans written in plain markdown drift. Agents are good at rewriting the plan to match what they did rather than what the plan said. Phase-gate works around this by keeping success criteria in a structured file that the implementer is forbidden from rewriting. The implementer can flip a passes field via the verification-flip.sh helper and append evidence, nothing else.

Seven check_type values are supported, grouped by who runs them. The runner (verification-run.sh) executes behavioral, structural, artifact, test, and e2e rows by running their check_command and gating on exit zero. The orchestrator (/implement and /validate) handles semantic rows by spawning a fresh verifier agent with a rubric, and agent-smoke rows by spawning the agent under test on a synthetic task. Every row carries an id, a phase, a criterion, a check_type, and either a check_command or (for semantic rows) a rubric array.
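The runner half of that split can be pictured with a small loop. This is a minimal sketch under stated assumptions, not the contents of `verification-run.sh`: the function name `run_rows`, the output format, and the sample rows are hypothetical, and only the field names and check types come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only, not the real verification-run.sh. Requires jq.

# run_rows FILE
# Prints "<id> pass|fail|skip" per row; exits non-zero if any runner row fails.
run_rows() {
  local file="$1" failures=0 row id kind cmd
  while IFS= read -r row; do
    [ -z "$row" ] && continue
    id="$(jq -r '.id' <<<"$row")"
    kind="$(jq -r '.check_type' <<<"$row")"
    case "$kind" in
      behavioral|structural|artifact|test|e2e)
        cmd="$(jq -r '.check_command' <<<"$row")"
        # Gate on exit zero; detach stdin so the command cannot eat later rows.
        if bash -c "$cmd" </dev/null >/dev/null 2>&1; then
          echo "$id pass"
        else
          echo "$id fail"
          failures=$((failures + 1))
        fi
        ;;
      *)
        # semantic and agent-smoke rows are left to the orchestrator.
        echo "$id skip"
        ;;
    esac
  done < "$file"
  [ "$failures" -eq 0 ]
}

# Hypothetical rows: one passing, one failing, one semantic.
rows="$(mktemp)"
cat > "$rows" <<'EOF'
{"id":"r1","phase":1,"criterion":"root dir exists","check_type":"artifact","check_command":"test -d /"}
{"id":"r2","phase":1,"criterion":"always fails","check_type":"test","check_command":"exit 3"}
{"id":"r3","phase":1,"criterion":"reads clearly","check_type":"semantic","rubric":["states the limit"]}
EOF

run_rows "$rows" || echo "gate stays closed"
```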

See docs/concepts.md for the full mental model and reference/verification-schema.md for the exact row contract.

Adversarial sweeps

After /implement, run /debug-sweep against the plan. It fires N sequential headless Claude sessions, each with a different lens.

| Severity | Sweeps | Lenses |
| --- | --- | --- |
| low | 0 | (skipped on docs-only diff) |
| medium | 2 | correctness, robustness |
| high | 3 | + architecture |
| critical | 4 | + adversarial |

Severity is auto-classified from the diff. Override with --severity critical if you want all four lenses regardless. Each sweep reads the plan, the changed files, and one or two hops of dependencies, then writes JSONL findings tagged from reference/bug-catalog.md. The next sweep gets the prior findings as an exclusion scope so it spends its budget on new ground rather than re-flagging.
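The severity table and the exclusion-scope hand-off can be sketched in a few lines of bash. The function names `lenses_for` and `plan_sweeps` and the printed format are hypothetical; the real orchestrator lives in `scripts/debug-sweep.sh` and may be organized differently. The sketch only prints the order of sweeps and which earlier lenses' findings each one would receive.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the severity table, not the real debug-sweep.sh.

# lenses_for SEVERITY -> space-separated lenses, cumulative per the table.
lenses_for() {
  case "$1" in
    low)      echo "" ;;
    medium)   echo "correctness robustness" ;;
    high)     echo "correctness robustness architecture" ;;
    critical) echo "correctness robustness architecture adversarial" ;;
    *)        echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# plan_sweeps SEVERITY
# Lists sweeps in order; each one excludes findings from the lenses before it.
plan_sweeps() {
  local severity="$1" lenses lens prior=""
  lenses="$(lenses_for "$severity")" || return 1
  for lens in $lenses; do
    echo "sweep lens=$lens exclude_findings_from=${prior:-none}"
    prior="${prior:+$prior,}$lens"
  done
}

plan_sweeps high
```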

Sweeps do not modify code. The output is a list of findings, and you decide which ones are real.

See docs/lens-design.md for the lens contracts and how to add your own.

What this is not

Phase-gate is not a test runner. Verification rows are gates that ask "did the plan produce what it said it would," not "is this code correct in the small." Use your existing test framework for the latter.

It is also not a linter. The debug sweep flags pattern-recurring bugs and architecture drift, not formatting or naming.

There is no DSL, no state machine, no orchestration layer beyond verification.jsonl. Four commands, three agents, a handful of shell scripts.

Customizing

Edit reference/bug-catalog.md to add entries when a pattern shows up in two or more of your plans. The schema is documented in the file header.

Populate scripts/debug-severity-paths.txt with globs that should auto-promote to critical severity. Auth, payments, anything load-bearing.
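To illustrate the idea, here is a minimal sketch of glob-based promotion. It assumes the file holds one glob per line and that `#` starts a comment; neither is confirmed here, so check `docs/customization.md` for the actual format. The function name `promoted_to_critical` and the sample paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: one glob per line and "#" comments are assumptions.

# promoted_to_critical GLOB_FILE PATH...
# Exit 0 if any changed path matches any glob in the file.
promoted_to_critical() {
  local globs="$1" path pattern
  shift
  while IFS= read -r pattern || [ -n "$pattern" ]; do
    # Skip blank lines and comment lines.
    case "$pattern" in ''|'#'*) continue ;; esac
    for path in "$@"; do
      # The unquoted right-hand side of == is matched as a glob by [[ ]].
      if [[ "$path" == $pattern ]]; then
        return 0
      fi
    done
  done < "$globs"
  return 1
}

# Hypothetical glob file and changed paths.
globs="$(mktemp)"
printf '%s\n' '# load-bearing paths' 'src/auth/*' 'src/payments/*' > "$globs"

promoted_to_critical "$globs" docs/README.md src/auth/session.py \
  && echo "severity: critical"
```

In this sketch `*` also matches across `/`, so `src/auth/*` covers nested files as well.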

Edit agents/debugger.md § "Lens contracts" to change what each sweep looks for, or to add new lenses for your project.

See docs/customization.md for the full surface area.

License

MIT. See LICENSE.
