This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Flow-Code is a Claude Code plugin for structured, plan-first development. The primary entry point is /flow-code:go (full autopilot: brainstorm → plan → work → review → close), plus individual slash commands, skills, and agents that orchestrate task tracking via a .flow/ directory. Core engine is a Rust binary (flowctl) with file-based JSON storage and MCP server support.
```
commands/flow-code/*.md  → Slash command definitions (user-invocable entry points)
skills/*/SKILL.md        → Utility & assessment skills (brainstorm, interview, debug, map, etc.)
codex/skills/*/SKILL.md  → Core workflow skills (plan, work, plan-review, impl-review, epic-review)
agents/*.md              → Subagent definitions (research scouts, worker, plan-sync, etc.)
bin/flowctl              → Rust binary (built from flowctl/ workspace)
flowctl/                 → Rust Cargo workspace (2 crates: core, cli)
hooks/hooks.json         → Ralph workflow guards (active when FLOW_RALPH=1)
docs/                    → Architecture docs, CI examples
```

Skill directories: Both skills/ and codex/skills/ are scanned. Core workflow skills (plan, work, reviews) live in codex/skills/ for Codex sync compatibility. Both directories are authoritative.
Skills: See docs/skills.md for the full classification. Core workflow: flow-code-run (unified phase loop via flowctl phase next/done).
Key invariant: The bin/flowctl Rust binary is the single source of truth for .flow/ state. Always invoke as:
```bash
FLOWCTL="${DROID_PLUGIN_ROOT:-${CLAUDE_PLUGIN_ROOT}}/bin/flowctl"
$FLOWCTL <command>
```

`/flow-code:go "idea"` is the full autopilot from raw idea to PR. It runs brainstorm (AI self-interview) → plan → plan-review → work → impl-review → close via `flowctl phase next/done`, with zero human input. For existing epics or spec files, brainstorm is auto-skipped.
Ralph (/flow-code:ralph-init) is the autonomous harness that runs this loop unattended.
Every epic passes through three independent, non-overlapping review gates:

| Layer | Tool | When | What it catches |
|---|---|---|---|
| 1. Guard | `flowctl guard` | Worker Phase 6, wave checkpoint, close phase | Lint, type errors, test failures |
| 2. RP Plan-Review | RP context_builder or Codex | Plan phase | Spec-code misalignment, missing requirements |
| 3. Codex Adversarial | `flowctl codex adversarial` | Epic completion | Security, concurrency, edge cases (different model family) |

All three must pass (or be skipped via `flowctl config set review.backend none`). The layers are complementary: guard catches lint, type, and test failures, RP catches spec drift, and Codex catches blind spots. Zero-findings-halt rule: a review cycle stops as soon as a pass reports no findings, which avoids unnecessary re-review iterations. Adversarial review is the exception: zero findings there triggers re-analysis, because the review may have been insufficient.
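
The zero-findings-halt rule and the per-phase iteration caps this document lists (plan review max 2, impl review max 3, epic review max 2) can be pictured as a small loop. The following is only a sketch under stated assumptions: `run_review` and `review_loop` are hypothetical names, `run_review` is a stub that pretends each fix round resolves exactly one finding, and neither is a flowctl command. The adversarial-review exception is not modeled.

```bash
# Hypothetical stub standing in for a real review call: prints the number
# of findings left at a given iteration, assuming each fix round resolves
# exactly one finding.
run_review() (
  iteration=$1
  initial=$2
  remaining=$(( initial - iteration + 1 ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  echo "$remaining"
)

# review_loop MAX_ITERATIONS INITIAL_FINDINGS
# Stops at the first review with zero findings, or gives up at the cap.
review_loop() (
  max=$1
  initial=$2
  i=1
  while [ "$i" -le "$max" ]; do
    findings=$(run_review "$i" "$initial")
    if [ "$findings" -eq 0 ]; then
      echo "SHIP iterations=$i"              # zero-findings halt
      return 0
    fi
    i=$(( i + 1 ))
  done
  echo "PROCEED_WITH_WARNING iterations=$max" # circuit breaker tripped
)
```

With the impl-review cap of 3 and two initial findings, `review_loop 3 2` prints `SHIP iterations=3`; with five initial findings and a cap of 2, `review_loop 2 5` prints `PROCEED_WITH_WARNING iterations=2`.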

| Flag | Accepted by | Effect |
|---|---|---|
| `--auto` | brainstorm | AI self-interview, zero human input |
| `--plan-only` | go, plan, run | Stop after plan phase |
| `--no-pr` | go, run | Skip draft PR creation at close |
| `--tdd` | work, run | Force test-first development (worker Phase 4) |
| `--interactive` | plan | Opt-in interview before planning |
| `--no-capability-scan` | plan | Skip capability-scout |
| `--research=rp\|grep` | plan | Override research backend |
| `--depth=short\|standard\|deep` | plan | Override plan depth |
| `--review=rp\|codex\|none` | plan, run | Override review backend |
| `--quick` | go, run | Fast path for trivial changes (skip brainstorm, plan review, impl review) |
| `--interactive` | go, run | Pause at key decisions for user confirmation |

Workers communicate with the coordinator via SendMessage with summary prefixes:

| Worker → Coordinator | Coordinator Response |
|---|---|
| "Task complete: fn-N.M" | Verify status=done, unlock files, advance wave |
| "Blocked: fn-N.M" | Log reason, skip task in current wave |
| "Spec conflict: fn-N.M" | Fix spec → "Spec updated: fn-N.M" or skip → "Task skipped: fn-N.M" |
| "Need file access: path" | "Access granted: path" or "Access denied: path" |
| "Need mutation: fn-N.M" | Execute split/skip/dep change, reply with result |

Approval timeouts: file access 120s, spec conflict 120s, mutation 300s. On timeout → worker self-blocks and stops.
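
Summary-prefix routing and the approval timeouts above can be illustrated with a pair of shell functions. This is a minimal sketch, not the coordinator's actual implementation: the function names `route_message` and `approval_timeout` and the action labels they print are hypothetical, chosen only to mirror the table.

```bash
# route_message MESSAGE
# Prints an illustrative action label for a worker message, keyed on its
# summary prefix. The labels are made up for this sketch.
route_message() {
  case "$1" in
    "Task complete: "*)    echo "verify-done-unlock-advance" ;;
    "Blocked: "*)          echo "log-and-skip-in-wave" ;;
    "Spec conflict: "*)    echo "fix-spec-or-skip" ;;
    "Need file access: "*) echo "grant-or-deny-access" ;;
    "Need mutation: "*)    echo "apply-dag-mutation" ;;
    *)                     echo "unknown" ;;
  esac
}

# approval_timeout MESSAGE
# Prints the approval timeout in seconds for requests that wait on the
# coordinator; prints nothing for messages that need no approval.
approval_timeout() {
  case "$1" in
    "Need file access: "*|"Spec conflict: "*) echo 120 ;;
    "Need mutation: "*)                       echo 300 ;;
  esac
}
```

For example, `route_message "Need mutation: fn-3.1"` prints `apply-dag-mutation`, and `approval_timeout "Need mutation: fn-3.1"` prints `300`. Matching on the full prefix including the trailing colon and space keeps an unrelated message from being routed by accident.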
```bash
# Smoke tests (flowctl core)
bash scripts/smoke_test.sh

# Full CI tests (flowctl + ralph helpers + symbol extraction)
bash scripts/ci_test.sh

# Teams e2e tests (file locking, ownership, protocol)
bash scripts/teams_e2e_test.sh

# Wave checkpoint tests (lock lifecycle, dependency unblock, stale lock recovery)
bash scripts/wave_checkpoint_test.sh

# Ralph e2e tests
bash scripts/ralph_e2e_test.sh
bash scripts/ralph_e2e_rp_test.sh        # RepoPrompt backend
bash scripts/ralph_e2e_short_rp_test.sh
```

All tests create temp directories and clean up after themselves. They must NOT be run from the plugin repo root (safety check enforced).

Storage runtime: All state is JSON/JSONL files in .flow/, readable by any tool (MCP, Read, Grep). No database, no async runtime. The json_store module in flowctl-core handles all file I/O.
```bash
# Build and test Rust flowctl
cd flowctl && cargo build --release && cargo test --all

# Validate JSON
python3 -c "import json; json.load(open('hooks/hooks.json'))"
```

Rust: clippy for linting, cargo test for tests. No TypeScript, no npm. Skills and agents are Markdown files (no build step).

- flowctl outputs JSON (`--json` flag) for machine consumption by skills/agents
- State machine: tasks follow `todo → in_progress → done` (with `blocked` and `skipped` side-states)
- Evidence-based completion: `flowctl done` requires `--summary-file` and `--evidence-json`
- Wave-Checkpoint-Wave: workers execute task batches in parallel with checkpoint gates
- Plan review gating: `flowctl next --require-plan-review` blocks work until plan is reviewed
- Architecture invariants: immutable rules registered via `flowctl invariant add` with verify commands
- Gap registry: epics carry a `gaps` field managed via `flowctl gap`, enforced at epic close
- Task restart: `flowctl restart <task-id>` resets a task and cascades to all downstream dependents (`--dry-run`, `--force`)
- Runtime DAG mutation: `flowctl task split <id> --titles "A|B|C" --chain` splits task into sub-tasks; `flowctl task skip <id> --reason` marks task as skipped (downstream deps treat as satisfied); `flowctl dep rm <task> <dep>` removes a dependency. Workers request mutations via "Need mutation:" protocol message
- Git diff snapshots: worker agent captures baseline rev before implementation and `workspace_changes` in evidence
- Review comparison: `flowctl review-backend --compare <files>` or `--epic <id>` detects consensus/conflict across review receipts (auto-archived to `.flow/reviews/`)
- Domain tagging: `flowctl task create --domain <domain>` tags tasks (frontend/backend/architecture/testing/docs/ops/general), filterable via `tasks --domain`
- Epic archival: `flowctl epic archive <id>` moves closed epic + tasks + specs + reviews to `.flow/.archive/`; `flowctl epic clean` archives all closed epics at once
- Learning loop: plan injects memory (Step 6), worker saves lessons (Phase 11, included in default sequence when memory.enabled is true), epic close prompts retro, retro verifies stale entries via `flowctl memory verify <id>`
- Task duration: `flowctl done` auto-tracks `duration_seconds` from start to completion, rendered in evidence
- File ownership: `flowctl task create --files <paths>` declares owned files; `flowctl files <id>` shows ownership map + conflict detection
- File locking (Teams): `flowctl lock --task <id> --files <paths>` acquires runtime file locks; `flowctl unlock --task <id>` releases on completion; `flowctl lock-check --file <path>` inspects lock state; `flowctl unlock --all` clears all locks between waves
- Agent Teams mode: `/flow-code:go` (or legacy `/flow-code:work`) spawns workers as Agent Team teammates with plain-text protocol messages (summary-prefix routing: "Task complete:", "Spec conflict:", "Blocked:", "Need file access:", "New task:", "Access granted/denied:", native `shutdown_request`) and file lock enforcement
- Adversarial review: `flowctl codex adversarial --base main [--focus "area"]` runs Codex in adversarial mode: it tries to break the code, not validate it. Returns SHIP/NEEDS_WORK with grounded findings
- Three-layer quality system: Layer 1: `flowctl guard` (deterministic lint/type/test; runs at Worker Phase 6, wave checkpoint, and close phase). Layer 2: RP plan-review (code-aware spec validation, invoked via `/flow-code:go` plan-review phase; RP sees full codebase via context_builder). Layer 3: `flowctl codex adversarial` (cross-model adversarial, epic completion; different model family catches blind spots). Spec conflicts and blockers forwarded to Codex for autonomous decision-making
- Review circuit breaker: Plan review max 2 iterations, impl review max 3, epic review max 2, which prevents infinite NEEDS_WORK cycles. After max iterations, pipeline proceeds with warning
- Review backend resolution: All review phases use the same priority chain: `--review` flag > `FLOW_REVIEW_BACKEND` env > `.flow/config.json` > default `none`. The `--no-review` flag is equivalent to `--review=none` and always wins
- Auto-improve analysis-driven: generates custom program.md from codebase analysis (hotspots, lint, coverage, memory) with Action Catalog ranked by impact, not static templates
- Auto-improve quantitative: captures before/after metrics per experiment, commit messages include delta `[lint:23→21]`
- Worker self-review: Phase 6 runs guard + structured diff review (correctness, quality, performance, testing) before commit
- Plan auto-execute: `/flow-code:go` (or legacy `/flow-code:plan`) defaults to auto-execute work after planning (Teams mode handles any task count); `--plan-only` to opt out
- Goal-backward verification: worker Phase 10 re-reads acceptance criteria and verifies each is actually satisfied before completing
- Full-auto by default: `/flow-code:go` requires zero interactive questions. AI reads git state, `.flow/` config, and request context to make branch, review, and research decisions autonomously. Default mode is Worktree + Teams + Phase-Gate (all three active). Work resumes from `.flow/` state on every startup (not a special "resume mode"). All tasks done → auto push + draft PR (`--no-pr` to skip)
- Cross-platform: flowctl is a single Rust binary (macOS/Linux). RP plan-review auto-degrades to Codex on platforms where rp-cli is unavailable. Bash hooks degrade gracefully on Windows (skip, don't block)
- Session start: CLAUDE.md instruction (not an enforced hook). If `.flow/` exists, run `flowctl status --interrupted` to check for unfinished work from a previous session and resume with the suggested `/flow-code:work <id>` command
- DAG cycle detection: `flowctl dep add` validates that adding a dependency does not create a cycle in the task dependency graph. If a cycle would be created, the command fails with an error. The DAG is validated using topological sort via the `petgraph` crate
- Concurrency-safe state: All read-modify-write operations on shared JSON state files (pipeline, phases, locks) use advisory file locks via `fs2` to prevent lost updates under concurrent access (e.g., Ralph daemon)
- Worker timeout: Workers have a 30-minute default timeout per task (configurable via `worker.timeout_minutes`). On timeout: task marked failed, file locks released, wave continues
- Stale lock recovery: Runs at wave start AND on worker completion; detects locks held by done/failed/blocked tasks and releases them to prevent deadlocks
- Worker phase mapping: Workers execute 12 internal phases (via `flowctl worker-phase next/done`) within the epic "Work" phase. Epic phases and worker phases are independent systems operating at different levels
- Project context: Optional `.flow/project-context.md` (template in `templates/project-context.md`) provides shared technical standards (stack, rules, architecture decisions, non-goals) that all worker agents read during Phase 2 re-anchoring. Keeps agents aligned on conventions code alone can't convey
- Code graph persistent index: `flowctl graph build` constructs symbol-level + file-level reference graph with forward/reverse edges, persisted to `.flow/graph.bin` via bincode. `graph refs` finds all references (<16ms), `graph impact` traces transitive dependents (BFS depth 3). Incremental update via `git diff --name-only`
- Intent-level API: `flowctl find` auto-routes queries (regex→index regex, symbol→graph refs, literal→trigram, fallback→fuzzy). `flowctl edit` tries exact str::replacen then fuzzy fudiff fallback. Reduces 7 raw tools to 4 intent commands for agent clarity
- N-gram bincode optimization: Index serialized as bincode (6.2MB→502KB, 12x smaller). Candidate verification via memchr::memmem (2-5x faster). Regex→trigram extraction via regex-syntax for indexed regex search
- ADR enforcement: 10 ADRs in docs/decisions/ with YAML frontmatter (verify + scope). 5 verify commands registered as .flow/invariants.md, checked by `flowctl invariants check` and `flowctl guard`
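
The review backend priority chain described in the list above (`--review` flag > `FLOW_REVIEW_BACKEND` env > `.flow/config.json` > default `none`, with `--no-review` always winning) can be sketched as a single function. This is an illustration under assumptions, not flowctl's code: `resolve_review_backend` and its positional-argument convention are hypothetical, and reading the value out of `.flow/config.json` is left to the caller.

```bash
# resolve_review_backend FLAG_VALUE NO_REVIEW CONFIG_VALUE
#   FLAG_VALUE    value passed to --review (empty if the flag was not given)
#   NO_REVIEW     "1" if --no-review was given, anything else otherwise
#   CONFIG_VALUE  backend already read from .flow/config.json (empty if unset)
# FLOW_REVIEW_BACKEND is read from the environment directly.
resolve_review_backend() {
  if [ "${2:-}" = "1" ]; then
    echo "none"                      # --no-review always wins
  elif [ -n "${1:-}" ]; then
    echo "$1"                        # explicit --review flag
  elif [ -n "${FLOW_REVIEW_BACKEND:-}" ]; then
    echo "$FLOW_REVIEW_BACKEND"      # environment variable
  elif [ -n "${3:-}" ]; then
    echo "$3"                        # value from .flow/config.json
  else
    echo "none"                      # default
  fi
}
```

For example, with `FLOW_REVIEW_BACKEND=rp` set and `codex` in the config, `resolve_review_backend "" "" codex` prints `rp`, while adding `--review=codex` (first argument `codex`) prints `codex`.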

- `ref/`: reference/backup repos
- `*.upstream`: upstream backup files
- `.tasks/`: runtime state
- `__pycache__/`: Python cache (hook scripts only)
- `.flow/`: per-project task state (runtime, not part of plugin)