Engineering manager-mode review that locks in execution architecture, data flow, diagrams, edge cases, test coverage, and performance before any code is written. Walks through issues interactively with opinionated recommendations.
| Property | Value |
|---|---|
| Trigger | Comment `/plan-eng-review` on an issue. Also runs as part of `/autoplan`. |
| Browser Required | No |
| Default State | ✅ Enabled |
| Results Path | Review log at `.github-gstack-intelligence/state/results/review/review-log.json` |
- Create or open an issue containing your plan (or reference a plan file on a branch).
- Comment `/plan-eng-review` on the issue.
- The skill runs a scope challenge first, then walks through each review section interactively.
- Each issue is presented one at a time with options, a recommendation, and reasoning mapped to engineering preferences.
Best for: When you have a plan or design doc and are about to start coding — use this to catch architecture issues, missing edge cases, and gaps in test coverage before implementation.
- DRY is important — repetition flagged aggressively.
- Well-tested code is non-negotiable — better too many tests than too few.
- "Engineered enough" — not under-engineered (fragile) or over-engineered (premature abstraction).
- More edge cases, not fewer — thoughtfulness > speed.
- Explicit over clever — 10-line obvious fix > 200-line abstraction.
- Minimal diff — achieve the goal with fewest new abstractions and files touched.
- State diagnosis — Teams: falling behind, treading water, repaying debt, innovating (Larson)
- Blast radius instinct — Worst case + how many systems affected
- Boring by default — "Three innovation tokens" — everything else is proven tech (McKinley)
- Incremental over revolutionary — Strangler fig, not big bang (Fowler)
- Systems over heroes — Design for tired humans at 3am, not best engineer on best day
- Reversibility preference — Feature flags, A/B tests, incremental rollouts
- Failure is information — Blameless postmortems, error budgets, chaos engineering (Allspaw, Google SRE)
- Org structure IS architecture — Conway's Law in practice (Team Topologies)
- DX is product quality — Slow CI, bad local dev → worse software, higher attrition
- Essential vs accidental complexity — "Is this solving a real problem or one we created?" (Brooks)
- Two-week smell test — Can't ship a small feature in two weeks? Onboarding problem disguised as architecture
- Glue work awareness — Recognize invisible coordination work (Reilly)
- Make the change easy, then make the easy change — Refactor first, implement second (Beck)
- Own your code in production — No wall between dev and ops (Majors)
- Error budgets over uptime targets — SLO budget to spend on shipping (Google SRE)
Pre-review:
- Design doc check — looks for existing design docs from `/office-hours`.
- Reads `CLAUDE.md`, `TODOS.md`, recent git history.
- Retrospective check for prior review cycles.
- Frontend/UI scope detection for design-related review.
Step 0: Scope Challenge
- Existing code leverage — Maps sub-problems to existing code.
- Minimum change set — Flags work that could be deferred.
- Complexity check — 8+ files or 2+ new classes/services = smell.
- Search check — Built-in alternatives, current best practices, known pitfalls.
- TODOS cross-reference — Deferred items blocking or bundleable with this plan.
- Completeness check — Shortcut vs complete version (AI-assisted coding makes completeness cheap).
- Distribution check — New artifact types need build/publish pipelines.
If the complexity check triggers, the skill recommends scope reduction through interactive discussion.
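The complexity check can be sketched as a small predicate. This is only an illustration: the skill applies the heuristic through its prompt, and the `PlanFootprint` shape and `complexitySmells` function are hypothetical names, not part of the skill. The two thresholds (8 or more files, 2 or more new classes/services) come from the Step 0 list above.

```typescript
// Hypothetical summary of what a plan touches; field names are assumptions.
interface PlanFootprint {
  filesTouched: number;
  newClassesOrServices: number;
}

// Thresholds taken from the Step 0 complexity check.
const FILE_THRESHOLD = 8;
const NEW_UNIT_THRESHOLD = 2;

// Returns the reasons a plan trips the complexity check (empty array = no smell).
function complexitySmells(plan: PlanFootprint): string[] {
  const reasons: string[] = [];
  if (plan.filesTouched >= FILE_THRESHOLD) {
    reasons.push(`${plan.filesTouched} files touched (threshold: ${FILE_THRESHOLD})`);
  }
  if (plan.newClassesOrServices >= NEW_UNIT_THRESHOLD) {
    reasons.push(
      `${plan.newClassesOrServices} new classes/services (threshold: ${NEW_UNIT_THRESHOLD})`
    );
  }
  return reasons;
}

// A plan touching 6 files with 1 new class (illustrative numbers) passes the check.
const smallPlanSmells = complexitySmells({ filesTouched: 6, newClassesOrServices: 1 });
```

Either condition alone is enough to flag the plan, which matches the "8+ files or 2+ new classes/services" wording.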
Review Sections (4 sections, interactive):
| Section | Focus | Key outputs |
|---|---|---|
| 1. Architecture | System design, dependency graph, data flow, scaling, security, production failure scenarios, distribution architecture | ASCII architecture diagram |
| 2. Code Quality | DRY violations, module structure, error handling, edge cases, over/under-engineering, stale diagram audit | Concrete file/line references |
| 3. Test Review | Test diagram of ALL new flows/codepaths/branches, gap analysis, failure path tests, pyramid check, flakiness risk, LLM eval requirements | Complete test coverage map |
| 4. Performance | N+1 queries, memory usage, database indexes, caching, background job sizing, slow paths, connection pool pressure | Top 3 slowest codepaths |
Each section follows a strict pattern: STOP after presenting findings → one issue = one question → options with effort/risk/maintenance → recommendation mapped to engineering preferences → wait for response before proceeding.
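One way to picture the "one issue = one question" pattern is as a data shape plus a renderer. The sketch below is an assumption-laden illustration: `ReviewIssue`, `ReviewOption`, and `formatIssue` are hypothetical names, and the effort/risk/maintenance values in the example are made up for demonstration, not taken from a real review.

```typescript
// Hypothetical shapes for a single review finding; names are illustrative only.
type Level = "low" | "medium" | "high";

interface ReviewOption {
  label: string; // "A", "B", ...
  summary: string;
  effort: Level;
  risk: Level;
  maintenance: Level;
}

interface ReviewIssue {
  title: string;
  options: ReviewOption[];
  recommended: string; // label of the recommended option
  preference: string; // engineering preference the recommendation maps to
}

// Renders exactly one issue as one question, so the review can stop and wait.
function formatIssue(issue: ReviewIssue): string {
  const lines: string[] = [`Issue: ${issue.title}`];
  for (const opt of issue.options) {
    const tag =
      opt.label === issue.recommended ? ` (recommended, ${issue.preference})` : "";
    lines.push(
      `  ${opt.label}) ${opt.summary}${tag} ` +
        `[effort: ${opt.effort}, risk: ${opt.risk}, maintenance: ${opt.maintenance}]`
    );
  }
  return lines.join("\n");
}

// Example issue; the effort/risk/maintenance levels are invented for illustration.
const exampleIssue: ReviewIssue = {
  title: "Worker <-> Redis connection has no backpressure mechanism",
  options: [
    { label: "A", summary: "Add queue depth limit", effort: "medium", risk: "low", maintenance: "low" },
    { label: "B", summary: "Rely on Redis memory limit", effort: "low", risk: "high", maintenance: "medium" },
  ],
  recommended: "A",
  preference: "P5: explicit",
};

const exampleQuestion = formatIssue(exampleIssue);
```

Keeping the recommendation and its preference on the option itself makes the reasoning visible in the same place the user makes the choice.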
- "NOT in scope" — Deferred work with one-line rationale each.
- "What already exists" — Existing code/flows mapped to plan sub-problems.
- TODOS.md updates — Each TODO presented individually with What, Why, Pros, Cons, Context, Dependencies.
- Diagrams — ASCII diagrams for non-trivial data flow, state machines, processing pipelines.
- Failure modes — For each new codepath: realistic failure scenario, test coverage, error handling, user visibility. Silent + untested + unhandled = critical gap.
- Worktree parallelization strategy — Dependency table, parallel lanes, execution order, conflict flags for parallel implementation.
- Completion summary — All findings at a glance with scope, issues, gaps, outside voice status, lake score.
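The failure-mode rule in the list above (silent + untested + unhandled = critical gap) can be expressed as a simple filter. This is a minimal sketch, assuming a hypothetical `FailureMode` record; the skill itself produces this registry as prose in the plan file, and the field names here are not part of it.

```typescript
// Hypothetical record for one failure mode of a new codepath; field names are assumptions.
interface FailureMode {
  codepath: string;
  scenario: string;
  tested: boolean; // is there a test covering this failure?
  handled: boolean; // is there error handling for it?
  userVisible: boolean; // does the user see the failure, or is it silent?
}

// Silent + untested + unhandled = critical gap.
function isCriticalGap(mode: FailureMode): boolean {
  return !mode.userVisible && !mode.tested && !mode.handled;
}

function criticalGaps(modes: FailureMode[]): FailureMode[] {
  return modes.filter(isCriticalGap);
}

// Illustrative registry; the boolean values are assumed for the example.
const exampleModes: FailureMode[] = [
  { codepath: "UserSync#perform", scenario: "API timeout", tested: true, handled: true, userVisible: true },
  { codepath: "UserSync#perform", scenario: "malformed response", tested: false, handled: false, userVisible: false },
];

const exampleGaps = criticalGaps(exampleModes);
```

A failure that is untested but still handled or visible to the user does not count as critical under this rule; only the combination of all three does.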
After completion, the skill persists review metadata to `.github-gstack-intelligence/state/results/review/review-log.json` for the review readiness dashboard in `/ship`.
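Of the required outputs listed above, the worktree parallelization strategy is the most algorithmic: a dependency table is turned into an execution order with parallel lanes. One possible way to derive that order is to group work items into waves whose dependencies are already complete. The sketch below assumes a hypothetical `DependencyTable` shape and made-up work item names; it is not the skill's actual method.

```typescript
// Hypothetical dependency table: work item -> items it depends on.
type DependencyTable = Record<string, string[]>;

// Groups items into ordered waves. Items in the same wave have all their
// dependencies satisfied by earlier waves, so they could run in parallel worktrees.
function executionWaves(deps: DependencyTable): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let pending = Object.keys(deps);

  while (pending.length > 0) {
    const ready = pending.filter((item) =>
      (deps[item] ?? []).every((d) => done.indexOf(d) !== -1)
    );
    if (ready.length === 0) {
      // Nothing can start: either a cycle or a dependency missing from the table.
      throw new Error("Unresolvable dependencies among: " + pending.join(", "));
    }
    waves.push(ready);
    for (const item of ready) {
      done.push(item);
    }
    pending = pending.filter((item) => ready.indexOf(item) === -1);
  }
  return waves;
}

// Illustrative work items (names are invented for this example).
const exampleWaves = executionWaves({
  "api-client": [],
  "db-migration": [],
  "sync-worker": ["api-client", "db-migration"],
});
// exampleWaves is [["api-client", "db-migration"], ["sync-worker"]]
```

Here the first wave holds two independent items that can proceed in separate lanes, and the second wave waits for both.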
## Engineering Plan Review — Issue #35
### Step 0: Scope Challenge
- Existing code: `UserSync` service already handles 60% of the sync logic
- Minimum diff: 6 files (under 8-file threshold ✅)
- Completeness: Plan proposes shortcut on error handling → recommend complete version (cheap with AI)
- TODOS: #12 (rate limiting) can be bundled — same blast radius
### Section 1: Architecture — 2 issues found
┌──────────┐     ┌──────────┐     ┌──────────┐
│  Client  │────▶│  API GW  │────▶│  Worker  │
└──────────┘     └──────────┘     └──────────┘
     │                                  │
     │           ┌──────────┐           │
     └──────────▶│  Redis   │◀──────────┘
                 └──────────┘
Issue 1: Worker ↔ Redis connection has no backpressure mechanism
→ A) Add queue depth limit (recommended, P5: explicit)
→ B) Rely on Redis memory limit (risky)
→ C) Do nothing (⚠️ fails under 10x load)
### Section 3: Test Review
NEW CODEPATHS:
- UserSync#perform (happy path)
- UserSync#perform (API timeout)
- UserSync#perform (rate limit 429)
- UserSync#perform (malformed response)
Coverage: 3/4 paths tested. Gap: malformed response → critical gap
### Completion Summary
Step 0: Scope accepted as-is
Architecture: 2 issues found (resolved)
Code Quality: 1 issue found (DRY violation)
Test Review: 1 gap (malformed response path)
Performance: 0 issues
Failure modes: 1 critical gap flagged
Parallelization: 2 lanes, both parallel
Lake Score: 4/4 chose complete option

In `config.json`:
{
  "skills": {
    "plan-eng-review": {
      "enabled": true,
      "trigger": "issue_comment"
    }
  }
}

| Field | Description |
|---|---|
| `enabled` | Whether the skill is active (`true`/`false`). |
| `trigger` | Event type. `issue_comment` means it fires when `/plan-eng-review` is commented on an issue. |
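To make the two fields concrete, the following sketch shows how a router might combine them with an incoming comment. It is an assumption, not the actual routing code: `shouldRun` is a hypothetical helper, the types are inferred only from the snippet above, and the rule that the command must be the first token of the comment is a guess at reasonable behavior.

```typescript
// Shapes inferred from the config.json snippet; only the fields shown there.
interface SkillConfig {
  enabled: boolean;
  trigger: string;
}

interface GstackConfig {
  skills: Record<string, SkillConfig>;
}

const SKILL_NAME = "plan-eng-review";

// Hypothetical helper: decides whether an event should start the skill.
// Assumes the slash command must be the first token of the comment body.
function shouldRun(config: GstackConfig, eventType: string, commentBody: string): boolean {
  const skill = config.skills[SKILL_NAME];
  if (!skill || !skill.enabled) {
    return false;
  }
  if (skill.trigger !== eventType) {
    return false;
  }
  const firstToken = commentBody.trim().split(/\s+/)[0];
  return firstToken === "/" + SKILL_NAME;
}

const exampleConfig: GstackConfig = JSON.parse(
  '{"skills":{"plan-eng-review":{"enabled":true,"trigger":"issue_comment"}}}'
);

const runsOnCommand = shouldRun(exampleConfig, "issue_comment", "/plan-eng-review");
```

Setting `enabled` to `false` in this sketch short-circuits before the comment is even inspected, which is the behavior the table describes.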
- Browser: Not required.
- Model: Uses the model configured in `config.json` defaults (currently `gpt-5.4`).
- Allowed tools: Read, Write, Grep, Glob, Bash, WebSearch.
- Benefits from: Prior `/office-hours` output for problem context and design docs.
- Plan file — Updated in-place with review findings, diagrams, and output sections.
- Review log — Written to `.github-gstack-intelligence/state/results/review/review-log.json` for the `/ship` readiness dashboard.
- TODOS.md — Updated with deferred items including effort estimates and priority.
- Required outputs: "NOT in scope," "What already exists," test diagram, failure modes registry, worktree parallelization strategy, completion summary.
- Skill prompt: `../skills/plan-eng-review.md`
- Config: `../config.json`
- Router: `../lifecycle/router.ts`
- References: `../skills/references/`
- TODOS format: `../skills/references/review-todos-format.md`
- `/autoplan` — Runs CEO + Design + Eng reviews automatically with auto-decisions
- `/plan-ceo-review` — CEO/founder review for strategy, scope, and premises
- `/plan-design-review` — Designer review for UI/UX completeness
- `/office-hours` — Run first to establish product direction and constraints