Skip to content

ducdmdev/agent-team-plugin

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

291 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Agent Team

Orchestrate parallel AI teammates in Claude Code — 2 pipeline stages (execute, audit) with per-task adversarial review (Executor → Reviewer → Challenger), hook enforcement, and workspace tracking.

Claude Code Plugin Live Demo

What It Does

This plugin adds Agent Team pipeline skills to Claude Code that decompose complex tasks into parallel work streams executed by ephemeral AI teammates across 2 stages.

  • A team lead coordinates the team but never writes code
  • 2 pipeline stages — execute (Phase 0 prep + per-task pipeline) and audit (integration-only verification)
  • Per-task adversarial pipeline — every task spawns an ephemeral Executor → Reviewer → Challenger trio
  • 3 universal roles with archetype-specific extensions (Implementation, Research, Audit, Planning)
  • Hooks enforce discipline: block premature completion, nudge idle teammates, validate workspace
  • Auto-chains to superpowers:writing-plans if no plan exists, then to audit when execution completes
  • A persistent workspace tracks progress, tasks, issues, per-task reviews/challenges, decisions, and generates a final report

See It In Action

Agent Team Demo

Here's what happens when you say: "use agent team to refactor the auth module" (with a pre-existing plan from superpowers:writing-plans):

You > /agent-team:execute docs/plans/0509-refactor-auth.md

━━ Phase 0 — Prep ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ✓ Plan resolved: docs/plans/0509-refactor-auth.md
  ✓ Archetype: Implementation
  ✓ Decomposed into 3 parallel streams + 1 convergence task

  Team plan:
    #1: Refactor token validation -> Executor (Implementation, plan-mode) -> owns src/auth/token.ts
    #2: Extract session management -> Executor (Implementation, plan-mode) -> owns src/auth/session.ts
    #3: Update middleware -> Executor (Implementation) -> owns src/auth/middleware.ts (blocked by #1, #2)
    #4: Integration test -> Executor (Implementation) -> owns test/integration/

  Per-task pipeline (default): each task runs Executor → Reviewer → Challenger.
  Critical path: #1 → #3 → #4 (length 3)
  Convergence points: #3 (#1 + #2)

  Workspace: .agent-team/0509-refactor-auth/
  Approve?

You > approve

━━ Phase 4 — Per-Task Pipelines (parallel across tasks #1 + #2) ━━

  Task #1 pipeline:
    spawn executor-task-1 (plan-mode)
    PLAN_PROPOSAL #1: Adapter pattern for token validation...
    PLAN_APPROVED #1
    STARTING #1 → COMPLETED #1 (3 files changed)
    → spawn reviewer-task-1
      Impact analysis: 4 call sites + 3 importers identified
      CODE_REVIEW #1: approve, 1 minor finding logged
    → spawn challenger-task-1
      Rules checked: 8/8 pass
      CHALLENGE_REVIEW #1: approve
    Task #1 complete.

  [Task #2 ran similar pipeline in parallel]

  Task #3 unblocked. Spawn pipeline...
    CHALLENGE_REVIEW #3: approve
  Task #4 spawn pipeline...
    CHALLENGE_REVIEW #4: approve

  exec-reviewer: EXECUTE_REVIEW: status=ready_for_audit, 4/4 tasks done

  → auto-chain to audit

━━ Phase 5 — Audit (integration-only) ━━━━━━━━━━━━━━━━━━━━━━━━━━

  reviewer:          Integration review of impact_files overlap (#1+#3, #2+#3):
                       ✓ Interface compatibility    ✓ Integrated build
                       ✓ Integrated tests           ✓ No regressions

  elegance-reviewer: ELEGANCE_REVIEW: overall_score=4.2

  audit-reviewer:    AUDIT_REVIEW: status=approved

  Report: .agent-team/0509-refactor-auth/report.md
  Total: 6 files changed, 4/4 tasks completed, 0 open issues, elegance score 4.2/5

The workspace persists at .agent-team/0509-refactor-auth/ with the full audit trail: tasks, issues, decisions, per-task reviews/task-{id}-review.md and task-{id}-challenge.md documents, lessons, and final report.

Prerequisites

Requirement Details
Claude Code CLI With plugin support
Feature flag CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in shell env or settings.json
jq Required by hook scripts. Hooks skip gracefully if missing
git Optional — used for file change detection

Installation

From Marketplace

First, add the marketplace:

claude plugin marketplace add ducdmdev/agent-team-plugin

Then install:

claude plugin install agent-team

Local Development

claude --plugin-dir /path/to/agent-team-plugin

Usage

Trigger the skill with phrases like:

> create a team to refactor the auth module
> work in parallel on the API endpoints and frontend components
> use agent team to build the new dashboard feature
> spawn teammates to review the PR from security, performance, and correctness angles

The skill activates when your task has 2+ independent work streams. If the task is better handled sequentially, the lead will tell you.

Pipeline Commands

Command When to Use Example
/agent-team:execute [plan-path] Pipeline entry point — Phase 0 prep + per-task pipeline + auto-chain to audit. If no plan, auto-chains to superpowers:writing-plans first. "use agent team to refactor auth" or "/agent-team:execute docs/plans/auth.md"
/agent-team:audit [workspace] Re-run integration-only verification on completed work "audit the team output"

How It Works

execute (Phase 0: resolve plan + decompose + approve)
   → execute (Phase 4: per-task Executor → Reviewer → Challenger pipeline)
   → audit (integration-only verification + final report)

The pipeline is two skills:

Stage What Happens
Execute Phase 0: resolve plan (or auto-chain to writing-plans), load prior context, detect archetype, decompose into parallel streams, mark plan-mode, gate on user approval. Phase 3: create team + init workspace. Phase 4: per-task adversarial pipeline (Executor → Reviewer → Challenger, ephemeral per task).
Audit Phase 5: completion gates per archetype + integration-only review (cross-task impact_files overlap) + elegance review + lessons capture + final report. Per-task within-task adversarial review is already done by per-task Challengers.

Each task spawns a 3-stage ephemeral pipeline:

  • Executor writes the artifact (code / findings / audit / design — per archetype)
  • Reviewer validates the artifact + performs grep-based impact analysis (call sites + importers → impact_files in task-graph.json)
  • Challenger deep-dives looking for what Executor and Reviewer missed (rules / compliance / edge cases)

Active agent count at any moment ≤ in-flight tasks. Each role shuts down before the next role spawns within a task. Per-task review docs persist at .agent-team/{team}/reviews/task-{id}-review.md and task-{id}-challenge.md.

Each stage has an inter-stage review agent (execute-reviewer, audit-reviewer) that validates output before the next stage begins.

Plan-aware: Phase 0 of execute resolves the plan from $ARGUMENTS, scans docs/plans/*.md, or auto-invokes superpowers:writing-plans if no plan is found. The team decomposition derives from the approved plan.

Roles

The team uses 3 universal roles plus a Lead orchestrator. Each role's behavior adapts based on the task's archetype (Implementation, Research, Audit, Planning).

Role Spawned Tools Per-task lifetime Archetype focus
Lead Once (your session) Coordination only Whole team Phase 0 prep + Phase 4 orchestration
Executor Per task Full general-purpose Until COMPLETED Implementation: code · Research: findings · Audit: report · Planning: design
Reviewer Per task Read + Grep + read-only Bash Until CODE_REVIEW Validates Executor output + grep-based impact analysis
Challenger Per task Read + Grep + read-only Bash Until CHALLENGE_REVIEW Adversarial deep-dive + rules/compliance

Legacy role names (Implementer, Researcher, Migrator, Tester, Auditor, Planner, etc.) survive as archetype specializations — they're colloquial labels for "Executor with archetype X + role-specific directives". See docs/teammate-roles.md for the full mapping table.

Team Archetypes

Phase 0.3 auto-detects the archetype from the plan + task description and uses it to configure Reviewer/Challenger focus and plan-mode defaults:

Archetype Trigger phrases Reviewer focus Challenger focus
Implementation "implement", "build", "refactor", "fix", "migrate" Tests, impact analysis, build/lint Edge cases, security, performance, coupling
Research "research", "investigate", "explore", "compare" Source citations, completeness, bias Missed sources, counter-evidence, scope gaps
Audit "audit", "review", "assess", "evaluate" Coverage, evidence, severity Skipped items, loopholes, severity drift
Planning "plan", "design", "architect", "spec" Structure, specificity, consistency Unconsidered alternatives, hidden trade-offs
Hybrid combines 2+ above Per dominant archetype + secondary Per dominant archetype + secondary

You can override the auto-detected archetype during the Phase 0.6 approval gate.

Communication Protocol

Teammates use structured messages for clean coordination:

STARTING #N:   what I plan to do, which files I'll touch
COMPLETED #N:  what I did, files changed, any concerns
BLOCKED #N:    severity={level}, what's blocking, impact
HANDOFF #N:    what I produced that another teammate needs
QUESTION:      what I need to know

Optional extended messages for long-running tasks:

PROGRESS #N:   milestone={desc}, percent={0-100}, eta={minutes}
CHECKPOINT #N: intermediate results, artifacts, ready_for=[task IDs]

Hooks

Thirteen hooks enforce team discipline and provide DAG-aware coordination:

TaskCompleted

Blocks premature task completion by checking:

  • Workspace exists with all tracking files (progress.md, tasks.md, issues.md)
  • Implementation tasks have actual file changes (via git status)
  • Supports scoped checks using task_id and teammate_name

TeammateIdle

Nudges idle teammates that still have in-progress tasks:

  • Counts assigned in-progress tasks
  • Loop protection: allows idle after 3 consecutive blocks (teammate is genuinely stuck)

SessionStart (compact)

Auto-recovers workspace context after context compaction:

  • Detects active workspaces and injects recovery context
  • Skips completed workspaces (status: done)

PreToolUse (Write|Edit)

Enforces file ownership boundaries:

  • Reads file-locks.json from the workspace to determine ownership
  • First violation: warns (exit 0). Second violation: blocks (exit 2)
  • Workspace files are always allowed regardless of ownership

ValidateTaskGraph (SubagentStart)

Validates task-graph.json schema and detects circular dependencies before each teammate is spawned:

  • Checks: valid JSON, nodes have required fields, dependency references resolve, no cycles
  • Blocks teammate spawn if task-graph is invalid or has circular dependencies
  • Gracefully allows spawn if task-graph doesn't exist yet (workspace may still be initializing)

WorkspaceCompleteness (SubagentStart)

Validates all 4 tracking files and required fields before teammate spawn:

  • Checks progress.md, tasks.md, issues.md, task-graph.json exist in the workspace
  • Validates that progress.md contains required fields (Archetype, Pipeline status)
  • Blocks teammate spawn if workspace is incomplete or missing required metadata

PlanRevisionLimit (PreToolUse(SendMessage))

Enforces max 2 plan-mode revision rounds per teammate:

  • Counts PLAN_REVISION messages sent to each teammate
  • Blocks the third PLAN_REVISION with guidance to approve or reassign
  • Prevents infinite plan-mode loops that waste context and time

PreShutdownCommit (PreToolUse(TeamDelete))

Blocks TeamDelete if any owned files have uncommitted changes:

  • Reads file-locks.json to determine which files each teammate owns
  • Checks git status for uncommitted changes in owned files
  • Blocks team deletion until all owned files are committed or explicitly abandoned

SubagentStart / SubagentStop

Tracks teammate lifecycle in events.log:

  • Logs spawn and stop events with timestamps and teammate metadata
  • Provides post-mortem analysis data

ComputeCriticalPath (TaskCompleted)

Recomputes and displays the critical path after each task completion:

  • Reads task-graph.json for the dependency graph
  • Outputs remaining critical path and identifies blocked critical tasks
  • Informational only — always allows task completion

DetectResume (SessionStart)

Detects resumable workspaces at session start:

  • Scans for incomplete task-graph.json files in .agent-team/
  • Validates completed task output files via git timestamps (valid/stale/missing)
  • Outputs resume context with options to resume or start fresh

CheckIntegrationPoint (TaskCompleted)

Detects when convergence points become fully unblocked:

  • Checks if all upstream tasks of a convergence point are completed
  • Nudges the lead to verify interface compatibility before downstream task starts
  • Informational only — silent when no convergence point is ready

All hooks degrade gracefully — exit 0 if jq is missing.

Workspace

Each team creates a persistent workspace at .agent-team/{team-name}/ in your project, where {team-name} uses an MMDD- date prefix for uniqueness (e.g., 0304-refactor-auth):

.agent-team/0509-refactor-auth/
├── progress.md      # Team status, members, decisions, handoffs
├── tasks.md         # Task ledger with status tracking
├── issues.md        # Issue tracker with severity and resolution
├── file-locks.json  # File ownership map (teammate -> files/directories)
├── task-graph.json  # DAG: task dependencies, impact_files, review_status, challenge_status
├── events.log       # Structured JSON event log for post-mortem analysis
├── reviews/         # Per-task review and challenge documents (v4.0)
│   ├── task-1-review.md      # Reviewer output per task
│   └── task-1-challenge.md   # Challenger output per task
└── report.md        # Final report (generated at completion)
  • Persists after team deletion — it's the permanent record
  • Shared — all teammates can read for context
  • Gitignored — coordination artifacts, not deliverables. Automatically added to .gitignore during Phase 3 workspace setup if not already excluded.

Plugin Structure

agent-team-plugin/
├── .claude-plugin/
│   ├── plugin.json              # Plugin metadata
│   └── marketplace.json         # Marketplace registry
├── hooks/
│   └── hooks.json               # Hook definitions (${CLAUDE_PLUGIN_ROOT} paths)
├── scripts/
│   ├── verify-task-complete.sh      # TaskCompleted hook
│   ├── check-teammate-idle.sh       # TeammateIdle hook
│   ├── recover-context.sh           # SessionStart(compact) hook
│   ├── check-file-ownership.sh      # PreToolUse(Write|Edit) hook
│   ├── validate-task-graph.sh       # ValidateTaskGraph hook (SubagentStart)
│   ├── check-workspace-completeness.sh  # WorkspaceCompleteness hook
│   ├── enforce-plan-revision-limit.sh   # PlanRevisionLimit hook
│   ├── enforce-pre-shutdown-commit.sh   # PreShutdownCommit hook
│   ├── track-teammate-lifecycle.sh  # SubagentStart/Stop hook
│   ├── setup-worktree.sh            # Worktree creation for isolation mode
│   ├── merge-worktrees.sh           # Worktree merge in Phase 5
│   ├── compute-critical-path.sh     # ComputeCriticalPath hook
│   ├── detect-resume.sh             # DetectResume hook
│   ├── check-integration-point.sh   # CheckIntegrationPoint hook
│   ├── record-demo.sh              # Demo recording utility
│   └── generate-demo-cast.sh       # Demo asciicast generator
├── skills/
│   ├── execute/
│   │   ├── SKILL.md             # Pipeline entry point — Phase 0 prep + Phase 4 per-task pipeline
│   │   ├── references/          # Communication protocol, coordination patterns, error recovery,
│   │   │                        # prior-context loading, plan-mode protocol
│   │   ├── examples/            # Plan proposal example
│   │   └── agents/              # Spawn templates (Executor / Reviewer / Challenger), execute-reviewer
│   └── audit/
│       ├── SKILL.md             # Audit stage — gates + integration-only review + lessons + report
│       ├── references/          # Completion gates, elegance rubric, report format
│       ├── examples/            # Lessons-learned examples
│       └── agents/              # Reviewer (integration-only), elegance-reviewer, audit-reviewer
├── docs/
│   ├── teammate-roles.md          # Role definitions and selection guide
│   ├── workspace-templates.md     # Workspace file templates
│   ├── team-archetypes.md         # Team type definitions and phase profiles
│   └── custom-roles.md            # Template for project-specific roles
├── tests/
│   ├── run-tests.sh               # Test runner (16 test files)
│   ├── lib/
│   │   └── test-helpers.sh        # Shared test utilities
│   ├── hooks/                     # Hook-specific tests
│   └── structure/                 # Plugin structure validation tests
├── CLAUDE.md
└── README.md

Troubleshooting

Agent Teams not available

Error: TeamCreate tool is not available

Set the feature flag:

export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

Or in Claude Code settings.json:

{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

jq not installed

Hooks skip checks silently without jq. Install for full enforcement:

brew install jq          # macOS
sudo apt install jq      # Ubuntu/Debian
scoop install jq         # Windows

Hooks not firing

  1. Verify installed: claude plugin list
  2. Check hooks/hooks.json exists
  3. Ensure scripts are executable: chmod +x scripts/*.sh

Team size limits

  • Max 4 for mixed teams (implementers + reviewers)
  • Up to 6 if extras are read-only (researchers, reviewers)
  • Break larger tasks into sequential phases

For teams larger than 4, verify: (1) every stream has zero file overlap, (2) cross-communication is minimal, (3) workspace churn is manageable.

Changelog

See CHANGELOG.md for a detailed version history.

About

Orchestrate parallel AI teammates in Claude Code — 2-skill pipeline (execute, audit) with per-task adversarial review (Executor → Reviewer → Challenger ephemeral), hook enforcement, and workspace tracking

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors