Orchestrate parallel AI teammates in Claude Code — 2 pipeline stages (execute, audit) with per-task adversarial review (Executor → Reviewer → Challenger), hook enforcement, and workspace tracking.
This plugin adds Agent Team pipeline skills to Claude Code that decompose complex tasks into parallel work streams executed by ephemeral AI teammates across 2 stages.
- A team lead coordinates the team but never writes code
- 2 pipeline stages — execute (Phase 0 prep + per-task pipeline) and audit (integration-only verification)
- Per-task adversarial pipeline — every task spawns an ephemeral Executor → Reviewer → Challenger trio
- 3 universal roles with archetype-specific extensions (Implementation, Research, Audit, Planning)
- Hooks enforce discipline: block premature completion, nudge idle teammates, validate workspace
- Auto-chains to superpowers:writing-plans if no plan exists, then to audit when execution completes
- A persistent workspace tracks progress, tasks, issues, per-task reviews/challenges, decisions, and generates a final report
Here's what happens when you say: "use agent team to refactor the auth module" (with a pre-existing plan from superpowers:writing-plans):
You > /agent-team:execute docs/plans/0509-refactor-auth.md
━━ Phase 0 — Prep ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✓ Plan resolved: docs/plans/0509-refactor-auth.md
✓ Archetype: Implementation
✓ Decomposed into 3 parallel streams + 1 convergence task
Team plan:
#1: Refactor token validation -> Executor (Implementation, plan-mode) -> owns src/auth/token.ts
#2: Extract session management -> Executor (Implementation, plan-mode) -> owns src/auth/session.ts
#3: Update middleware -> Executor (Implementation) -> owns src/auth/middleware.ts (blocked by #1, #2)
#4: Integration test -> Executor (Implementation) -> owns test/integration/
Per-task pipeline (default): each task runs Executor → Reviewer → Challenger.
Critical path: #1 → #3 → #4 (length 3)
Convergence points: #3 (#1 + #2)
Workspace: .agent-team/0509-refactor-auth/
Approve?
You > approve
━━ Phase 4 — Per-Task Pipelines (parallel across tasks #1 + #2) ━━
Task #1 pipeline:
spawn executor-task-1 (plan-mode)
PLAN_PROPOSAL #1: Adapter pattern for token validation...
PLAN_APPROVED #1
STARTING #1 → COMPLETED #1 (3 files changed)
→ spawn reviewer-task-1
Impact analysis: 4 call sites + 3 importers identified
CODE_REVIEW #1: approve, 1 minor finding logged
→ spawn challenger-task-1
Rules checked: 8/8 pass
CHALLENGE_REVIEW #1: approve
Task #1 complete.
[Task #2 ran similar pipeline in parallel]
Task #3 unblocked. Spawn pipeline...
CHALLENGE_REVIEW #3: approve
Task #4 spawn pipeline...
CHALLENGE_REVIEW #4: approve
exec-reviewer: EXECUTE_REVIEW: status=ready_for_audit, 4/4 tasks done
→ auto-chain to audit
━━ Phase 5 — Audit (integration-only) ━━━━━━━━━━━━━━━━━━━━━━━━━━
reviewer: Integration review of impact_files overlap (#1+#3, #2+#3):
✓ Interface compatibility ✓ Integrated build
✓ Integrated tests ✓ No regressions
elegance-reviewer: ELEGANCE_REVIEW: overall_score=4.2
audit-reviewer: AUDIT_REVIEW: status=approved
Report: .agent-team/0509-refactor-auth/report.md
Total: 6 files changed, 4/4 tasks completed, 0 open issues, elegance score 4.2/5
The workspace persists at .agent-team/0509-refactor-auth/ with the full audit trail: tasks, issues, decisions, per-task reviews/task-{id}-review.md and task-{id}-challenge.md documents, lessons, and final report.
| Requirement | Details |
|---|---|
| Claude Code CLI | With plugin support |
| Feature flag | CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in shell env or settings.json |
| jq | Required by hook scripts. Hooks skip gracefully if missing |
| git | Optional, used for file change detection |
First, add the marketplace:
claude plugin marketplace add ducdmdev/agent-team-plugin
Then install:
claude plugin install agent-team
Or load the plugin from a local directory:
claude --plugin-dir /path/to/agent-team-plugin
Trigger the skill with phrases like:
> create a team to refactor the auth module
> work in parallel on the API endpoints and frontend components
> use agent team to build the new dashboard feature
> spawn teammates to review the PR from security, performance, and correctness angles
The skill activates when your task has 2+ independent work streams. If the task is better handled sequentially, the lead will tell you.
| Command | When to Use | Example |
|---|---|---|
| /agent-team:execute [plan-path] | Pipeline entry point: Phase 0 prep + per-task pipeline + auto-chain to audit. If no plan, auto-chains to superpowers:writing-plans first. | "use agent team to refactor auth" or "/agent-team:execute docs/plans/auth.md" |
| /agent-team:audit [workspace] | Re-run integration-only verification on completed work | "audit the team output" |
execute (Phase 0: resolve plan + decompose + approve)
→ execute (Phase 4: per-task Executor → Reviewer → Challenger pipeline)
→ audit (integration-only verification + final report)
The pipeline is two skills:
| Stage | What Happens |
|---|---|
| Execute | Phase 0: resolve plan (or auto-chain to writing-plans), load prior context, detect archetype, decompose into parallel streams, mark plan-mode, gate on user approval. Phase 3: create team + init workspace. Phase 4: per-task adversarial pipeline (Executor → Reviewer → Challenger, ephemeral per task). |
| Audit | Phase 5: completion gates per archetype + integration-only review (cross-task impact_files overlap) + elegance review + lessons capture + final report. Within-task adversarial review is not repeated here; the per-task Challengers have already done it. |
Each task spawns a 3-stage ephemeral pipeline:
- Executor writes the artifact (code / findings / audit / design — per archetype)
- Reviewer validates the artifact + performs grep-based impact analysis (call sites + importers → impact_files in task-graph.json)
- Challenger deep-dives looking for what Executor and Reviewer missed (rules / compliance / edge cases)
Active agent count at any moment ≤ in-flight tasks. Each role shuts down before the next role spawns within a task. Per-task review docs persist at .agent-team/{team}/reviews/task-{id}-review.md and task-{id}-challenge.md.
Each stage has an inter-stage review agent (execute-reviewer, audit-reviewer) that validates output before the next stage begins.
Plan-aware: Phase 0 of execute resolves the plan from $ARGUMENTS, scans docs/plans/*.md, or auto-invokes superpowers:writing-plans if no plan is found. The team decomposition derives from the approved plan.
The team uses 3 universal roles plus a Lead orchestrator. Each role's behavior adapts based on the task's archetype (Implementation, Research, Audit, Planning).
| Role | Spawned | Tools | Per-task lifetime | Archetype focus |
|---|---|---|---|---|
| Lead | Once (your session) | Coordination only | Whole team | Phase 0 prep + Phase 4 orchestration |
| Executor | Per task | Full general-purpose | Until COMPLETED | Implementation: code · Research: findings · Audit: report · Planning: design |
| Reviewer | Per task | Read + Grep + read-only Bash | Until CODE_REVIEW | Validates Executor output + grep-based impact analysis |
| Challenger | Per task | Read + Grep + read-only Bash | Until CHALLENGE_REVIEW | Adversarial deep-dive + rules/compliance |
Legacy role names (Implementer, Researcher, Migrator, Tester, Auditor, Planner, etc.) survive as archetype specializations — they're colloquial labels for "Executor with archetype X + role-specific directives". See docs/teammate-roles.md for the full mapping table.
Phase 0.3 auto-detects the archetype from the plan + task description and uses it to configure Reviewer/Challenger focus and plan-mode defaults:
| Archetype | Trigger phrases | Reviewer focus | Challenger focus |
|---|---|---|---|
| Implementation | "implement", "build", "refactor", "fix", "migrate" | Tests, impact analysis, build/lint | Edge cases, security, performance, coupling |
| Research | "research", "investigate", "explore", "compare" | Source citations, completeness, bias | Missed sources, counter-evidence, scope gaps |
| Audit | "audit", "review", "assess", "evaluate" | Coverage, evidence, severity | Skipped items, loopholes, severity drift |
| Planning | "plan", "design", "architect", "spec" | Structure, specificity, consistency | Unconsidered alternatives, hidden trade-offs |
| Hybrid | combines 2+ above | Per dominant archetype + secondary | Per dominant archetype + secondary |
You can override the auto-detected archetype during the Phase 0.6 approval gate.
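As a rough illustration of trigger-phrase matching, the table above can be read as a keyword lookup. This is a sketch only: the `TRIGGERS` mapping copies the phrases from the table, but `detect_archetype` and its whole-word matching rule are hypothetical, and the plugin's real detection (which also reads the plan) may work quite differently.

```python
import re
from typing import Optional

# Trigger phrases copied from the archetype table; the matching rule is an assumption.
TRIGGERS = {
    "Implementation": ("implement", "build", "refactor", "fix", "migrate"),
    "Research": ("research", "investigate", "explore", "compare"),
    "Audit": ("audit", "review", "assess", "evaluate"),
    "Planning": ("plan", "design", "architect", "spec"),
}


def detect_archetype(description: str) -> Optional[str]:
    """Return the matching archetype, "Hybrid" for 2+ matches, or None for no match."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    matched = [name for name, phrases in TRIGGERS.items() if words & set(phrases)]
    if len(matched) >= 2:
        return "Hybrid"
    return matched[0] if matched else None
```

Under these assumptions, "use agent team to refactor the auth module" resolves to Implementation, while a request that mixes "research" and "design" resolves to Hybrid.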
Teammates use structured messages for clean coordination:
STARTING #N: what I plan to do, which files I'll touch
COMPLETED #N: what I did, files changed, any concerns
BLOCKED #N: severity={level}, what's blocking, impact
HANDOFF #N: what I produced that another teammate needs
QUESTION: what I need to know
Optional extended messages for long-running tasks:
PROGRESS #N: milestone={desc}, percent={0-100}, eta={minutes}
CHECKPOINT #N: intermediate results, artifacts, ready_for=[task IDs]
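Because every message starts with a fixed keyword, the protocol is easy to parse mechanically. The sketch below is one possible parser, not the plugin's own code: the keywords come from the listings above, while `parse_message`, the regular expression, and the returned dictionary shape are assumptions made for illustration.

```python
import re
from typing import Optional

# Keywords come from the protocol listing; the regex itself is an assumption.
TASK_MESSAGE = re.compile(
    r"^(STARTING|COMPLETED|BLOCKED|HANDOFF|PROGRESS|CHECKPOINT) #(\d+): (.*)$"
)


def parse_message(line: str) -> Optional[dict]:
    """Parse one structured teammate message, or return None if it does not match."""
    line = line.strip()
    if line.startswith("QUESTION: "):
        # QUESTION is the only message kind without a task number.
        return {"kind": "QUESTION", "task": None, "body": line[len("QUESTION: "):]}
    match = TASK_MESSAGE.match(line)
    if match is None:
        return None
    kind, task, body = match.groups()
    return {"kind": kind, "task": int(task), "body": body}
```

For example, `parse_message("BLOCKED #3: severity=high, waiting on #1")` yields kind `BLOCKED` and task `3`, with the rest of the line left in `body` for further parsing.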
Thirteen hooks enforce team discipline and provide DAG-aware coordination:
Blocks premature task completion by checking:
- Workspace exists with all tracking files (progress.md, tasks.md, issues.md)
- Implementation tasks have actual file changes (via git status)
- Supports scoped checks using task_id and teammate_name
Nudges idle teammates that still have in-progress tasks:
- Counts assigned in-progress tasks
- Loop protection: allows idle after 3 consecutive blocks (teammate is genuinely stuck)
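The loop protection rule can be written as a small decision function. This is a sketch under stated assumptions: `idle_decision` and its parameters are hypothetical names, and the use of exit code 2 for "block" mirrors the convention described for the file ownership hook rather than anything stated for this hook.

```python
def idle_decision(in_progress_tasks: int, consecutive_blocks: int, max_blocks: int = 3) -> int:
    """Return a hook exit code: 2 blocks the idle transition (nudge), 0 allows it."""
    if in_progress_tasks == 0:
        return 0  # nothing assigned, so idling is fine
    if consecutive_blocks >= max_blocks:
        return 0  # already nudged 3 times in a row: treat the teammate as stuck
    return 2  # still has in-progress work: nudge it to continue
```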
Auto-recovers workspace context after context compaction:
- Detects active workspaces and injects recovery context
- Skips completed workspaces (status: done)
Enforces file ownership boundaries:
- Reads file-locks.json from the workspace to determine ownership
- First violation: warns (exit 0). Second violation: blocks (exit 2)
- Workspace files are always allowed regardless of ownership
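The warn-then-block rule might look like the following. The shape of file-locks.json used here (teammate name mapped to a list of relative files or directories) is an assumption based on the workspace listing's "teammate -> files/directories" description; `owns`, `ownership_exit_code`, and the `prior_violations` counter are hypothetical names.

```python
from pathlib import PurePosixPath


def owns(locks: dict, teammate: str, path: str) -> bool:
    """True if the path is an owned file or sits inside an owned directory."""
    target = PurePosixPath(path)
    for owned in locks.get(teammate, []):
        owned_path = PurePosixPath(owned)
        if target == owned_path or owned_path in target.parents:
            return True
    return False


def ownership_exit_code(locks: dict, teammate: str, path: str, prior_violations: int) -> int:
    """Return 0 to allow (possibly with a warning) or 2 to block the write."""
    if PurePosixPath(".agent-team") in PurePosixPath(path).parents:
        return 0  # workspace files are always allowed
    if owns(locks, teammate, path):
        return 0
    # First violation only warns; any later violation blocks.
    return 0 if prior_violations == 0 else 2
```

Paths are assumed to be relative to the project root; a real hook would also need to normalize absolute paths before comparing.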
Validates task-graph.json schema and detects circular dependencies before each teammate is spawned:
- Checks: valid JSON, nodes have required fields, dependency references resolve, no cycles
- Blocks teammate spawn if task-graph is invalid or has circular dependencies
- Gracefully allows spawn if task-graph doesn't exist yet (workspace may still be initializing)
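The checks listed above (required fields, resolvable references, no cycles) can be sketched with Kahn's topological sort. The task-graph.json schema is not shown in this README, so the `nodes`, `id`, and `depends_on` field names below are assumptions, as is the function name `validate_task_graph`.

```python
from collections import deque

# Hypothetical field names; the real task-graph.json schema may differ.
REQUIRED_FIELDS = ("id", "depends_on")


def validate_task_graph(graph: dict) -> list:
    """Return a list of error strings; an empty list means the graph is valid."""
    nodes = graph.get("nodes", [])
    errors = [
        f"node {node.get('id', '?')} is missing '{field}'"
        for node in nodes
        for field in REQUIRED_FIELDS
        if field not in node
    ]
    if errors:
        return errors

    ids = {node["id"] for node in nodes}
    errors = [
        f"task {node['id']} depends on unknown task {dep}"
        for node in nodes
        for dep in node["depends_on"]
        if dep not in ids
    ]
    if errors:
        return errors

    # Kahn's algorithm: repeatedly remove tasks with no unmet dependencies.
    unmet = {node["id"]: len(set(node["depends_on"])) for node in nodes}
    dependents = {task_id: [] for task_id in ids}
    for node in nodes:
        for dep in set(node["depends_on"]):
            dependents[dep].append(node["id"])
    ready = deque(task_id for task_id, count in unmet.items() if count == 0)
    visited = 0
    while ready:
        current = ready.popleft()
        visited += 1
        for downstream in dependents[current]:
            unmet[downstream] -= 1
            if unmet[downstream] == 0:
                ready.append(downstream)
    if visited != len(ids):
        errors.append("circular dependency detected")
    return errors
```

Any task that can never reach zero unmet dependencies is part of, or downstream of, a cycle, which is why a visit count short of the node count signals a circular dependency.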
Validates all 4 tracking files and required fields before teammate spawn:
- Checks progress.md, tasks.md, issues.md, task-graph.json exist in the workspace
- Validates that progress.md contains required fields (Archetype, Pipeline status)
- Blocks teammate spawn if workspace is incomplete or missing required metadata
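A minimal version of this completeness check could look like the sketch below. The file names and required field names come from the list above; the helper names and the simple substring test for fields are assumptions, since the actual format of progress.md is not shown here.

```python
from pathlib import Path

TRACKING_FILES = ("progress.md", "tasks.md", "issues.md", "task-graph.json")
REQUIRED_PROGRESS_FIELDS = ("Archetype", "Pipeline status")


def read_tracking_files(workspace: Path) -> dict:
    """Map each tracking file that exists in the workspace to its text content."""
    return {
        name: (workspace / name).read_text(encoding="utf-8")
        for name in TRACKING_FILES
        if (workspace / name).is_file()
    }


def workspace_problems(files: dict) -> list:
    """Return human-readable problems; an empty list means the spawn may proceed."""
    problems = [f"missing {name}" for name in TRACKING_FILES if name not in files]
    if "progress.md" in files:
        # Assumption: a field counts as present if its name appears anywhere in the file.
        problems += [
            f"progress.md lacks '{field}'"
            for field in REQUIRED_PROGRESS_FIELDS
            if field not in files["progress.md"]
        ]
    return problems
```

A caller would run `workspace_problems(read_tracking_files(Path(".agent-team/0509-refactor-auth")))` and block the spawn when the result is non-empty.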
Enforces max 2 plan-mode revision rounds per teammate:
- Counts PLAN_REVISION messages sent to each teammate
- Blocks the third PLAN_REVISION with guidance to approve or reassign
- Prevents infinite plan-mode loops that waste context and time
Blocks TeamDelete if any owned files have uncommitted changes:
- Reads file-locks.json to determine which files each teammate owns
- Checks git status for uncommitted changes in owned files
- Blocks team deletion until all owned files are committed or explicitly abandoned
Tracks teammate lifecycle in events.log:
- Logs spawn and stop events with timestamps and teammate metadata
- Provides post-mortem analysis data
Recomputes and displays the critical path after each task completion:
- Reads task-graph.json for the dependency graph
- Outputs remaining critical path and identifies blocked critical tasks
- Informational only — always allows task completion
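The remaining critical path is the longest chain of unfinished tasks in the dependency graph. The sketch below computes it by task count, using the four-task example from the walkthrough; the `depends_on` mapping (including the assumption that #4 depends on #3) and the function name are illustrative, and the input is assumed to be acyclic.

```python
from functools import lru_cache


def critical_path(depends_on: dict, completed=frozenset()) -> list:
    """Return the longest chain of not-yet-completed tasks, in execution order."""
    remaining = {task for task in depends_on if task not in completed}

    @lru_cache(maxsize=None)
    def longest_chain_ending_at(task):
        best = ()
        for dep in depends_on[task]:
            if dep in remaining:
                candidate = longest_chain_ending_at(dep)
                if len(candidate) > len(best):
                    best = candidate
        return best + (task,)

    if not remaining:
        return []
    chains = (longest_chain_ending_at(task) for task in sorted(remaining))
    return list(max(chains, key=len))


# Dependency shape assumed from the example plan: #3 waits on #1 and #2, #4 waits on #3.
example = {1: [], 2: [], 3: [1, 2], 4: [3]}
print(critical_path(example))                 # [1, 3, 4]
print(critical_path(example, completed={1}))  # [2, 3, 4]
```

Ties are broken in favor of the lowest task ID, which is why the first call reports #1 rather than #2 at the head of the path.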
Detects resumable workspaces at session start:
- Scans for incomplete task-graph.json files in .agent-team/
- Validates completed task output files via git timestamps (valid/stale/missing)
- Outputs resume context with options to resume or start fresh
Detects when convergence points become fully unblocked:
- Checks if all upstream tasks of a convergence point are completed
- Nudges the lead to verify interface compatibility before downstream task starts
- Informational only — silent when no convergence point is ready
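Readiness of a convergence point reduces to "every upstream task is completed". In this sketch a convergence point is taken to be any task with two or more upstream dependencies, matching the "#3 (#1 + #2)" example earlier; the status strings and the function name are hypothetical.

```python
def ready_convergence_points(depends_on: dict, status: dict) -> list:
    """Return pending tasks with 2+ upstream dependencies that are all completed."""
    ready = []
    for task, deps in depends_on.items():
        is_convergence = len(deps) >= 2
        if (
            is_convergence
            and status.get(task) == "pending"
            and all(status.get(dep) == "completed" for dep in deps)
        ):
            ready.append(task)
    return ready
```

An empty result corresponds to the hook staying silent.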
All hooks degrade gracefully — exit 0 if jq is missing.
Each team creates a persistent workspace at .agent-team/{team-name}/ in your project, where {team-name} uses an MMDD- date prefix for uniqueness (e.g., 0304-refactor-auth):
.agent-team/0509-refactor-auth/
├── progress.md # Team status, members, decisions, handoffs
├── tasks.md # Task ledger with status tracking
├── issues.md # Issue tracker with severity and resolution
├── file-locks.json # File ownership map (teammate -> files/directories)
├── task-graph.json # DAG: task dependencies, impact_files, review_status, challenge_status
├── events.log # Structured JSON event log for post-mortem analysis
├── reviews/ # Per-task review and challenge documents (v4.0)
│ ├── task-1-review.md # Reviewer output per task
│ └── task-1-challenge.md # Challenger output per task
└── report.md # Final report (generated at completion)
- Persists after team deletion — it's the permanent record
- Shared — all teammates can read for context
- Gitignored: coordination artifacts, not deliverables. Automatically added to .gitignore during Phase 3 workspace setup if not already excluded.
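For reference, a workspace name with the MMDD- prefix could be derived as follows. Only the prefix format is taken from this README; the slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the `team_name` helper are assumptions.

```python
import re
from datetime import date


def team_name(description: str, today: date) -> str:
    """Build an MMDD-prefixed workspace name such as 0509-refactor-auth."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{today:%m%d}-{slug}"
```

Note that the prefix carries no year, so uniqueness holds only within a year for a given slug.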
agent-team-plugin/
├── .claude-plugin/
│ ├── plugin.json # Plugin metadata
│ └── marketplace.json # Marketplace registry
├── hooks/
│ └── hooks.json # Hook definitions (${CLAUDE_PLUGIN_ROOT} paths)
├── scripts/
│ ├── verify-task-complete.sh # TaskCompleted hook
│ ├── check-teammate-idle.sh # TeammateIdle hook
│ ├── recover-context.sh # SessionStart(compact) hook
│ ├── check-file-ownership.sh # PreToolUse(Write|Edit) hook
│ ├── validate-task-graph.sh # ValidateTaskGraph hook (SubagentStart)
│ ├── check-workspace-completeness.sh # WorkspaceCompleteness hook
│ ├── enforce-plan-revision-limit.sh # PlanRevisionLimit hook
│ ├── enforce-pre-shutdown-commit.sh # PreShutdownCommit hook
│ ├── track-teammate-lifecycle.sh # SubagentStart/Stop hook
│ ├── setup-worktree.sh # Worktree creation for isolation mode
│ ├── merge-worktrees.sh # Worktree merge in Phase 5
│ ├── compute-critical-path.sh # ComputeCriticalPath hook
│ ├── detect-resume.sh # DetectResume hook
│ ├── check-integration-point.sh # CheckIntegrationPoint hook
│ ├── record-demo.sh # Demo recording utility
│ └── generate-demo-cast.sh # Demo asciicast generator
├── skills/
│ ├── execute/
│ │ ├── SKILL.md # Pipeline entry point — Phase 0 prep + Phase 4 per-task pipeline
│ │ ├── references/ # Communication protocol, coordination patterns, error recovery,
│ │ │ # prior-context loading, plan-mode protocol
│ │ ├── examples/ # Plan proposal example
│ │ └── agents/ # Spawn templates (Executor / Reviewer / Challenger), execute-reviewer
│ └── audit/
│ ├── SKILL.md # Audit stage — gates + integration-only review + lessons + report
│ ├── references/ # Completion gates, elegance rubric, report format
│ ├── examples/ # Lessons-learned examples
│ └── agents/ # Reviewer (integration-only), elegance-reviewer, audit-reviewer
├── docs/
│ ├── teammate-roles.md # Role definitions and selection guide
│ ├── workspace-templates.md # Workspace file templates
│ ├── team-archetypes.md # Team type definitions and phase profiles
│ └── custom-roles.md # Template for project-specific roles
├── tests/
│ ├── run-tests.sh # Test runner (16 test files)
│ ├── lib/
│ │ └── test-helpers.sh # Shared test utilities
│ ├── hooks/ # Hook-specific tests
│ └── structure/ # Plugin structure validation tests
├── CLAUDE.md
└── README.md
Error: TeamCreate tool is not available
Set the feature flag:
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
Or in Claude Code settings.json:
{
"env": {
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
}
}
Hooks skip checks silently without jq. Install for full enforcement:
brew install jq # macOS
sudo apt install jq # Ubuntu/Debian
scoop install jq # Windows
- Verify installed: claude plugin list
- Check hooks/hooks.json exists
- Ensure scripts are executable: chmod +x scripts/*.sh
- Max 4 for mixed teams (implementers + reviewers)
- Up to 6 if extras are read-only (researchers, reviewers)
- Break larger tasks into sequential phases
For teams larger than 4, verify: (1) every stream has zero file overlap, (2) cross-communication is minimal, (3) workspace churn is manageable.
See CHANGELOG.md for a detailed version history.
