Autonomous Claude Swarms -- Teams Improvement => feat(plugins): swarm-orchestrator — DAG-aware multi-tier coordination with role-typed heads for autonomous agent teams by kushalj1997 · Pull Request #57880 · anthropics/claude-code

kushalj1997 · 2026-05-10T20:15:39Z

Autonomous Claude Swarms and Agent Teams Improvement

This PR is intended to improve the native Agent Teams feature of Claude Code. It started as a self-built project (using Claude Code) to speed up task execution for personal projects. I noticed the native Teams Feature on Sunday, May 10, 2026, and figured this would naturally align with the development. Looking forward to iterating on this with the team, and the time is well appreciated. Leveraged Claude Code to build this project and the entire PR. Claude Code drove the integration with native Teams feature, and I drove the initial swarming DAG + meta-supervisor (multi-headed) featureset, all with Claude Code.

Summary

Claude Code has opened up a world of possibility for me. In this PR, I propose an architecture for autonomous development leveraging the Claude API. It naturally falls alongside the in-beta Teams feature native to Anthropic. Thanks for reviewing!

Adds a new plugin at plugins/swarm-orchestrator/ that layers DAG-aware multi-agent coordination on top of vanilla Teams — additive only, no changes to TaskCreate / TaskUpdate / SendMessage / Team* schemas.

The headline capability: autonomous, session-resistant swarm orchestration, backed by a DAG-aware kanban with atomic task claiming so multiple parallel workers can safely pull from the same task queue without races. When claude-swarm init --persistent is configured, the meta-supervisor daemon polls task lists, dispatches role-typed heads, gates merges, and recovers crashed teammates — driven by filesystem state, not by a live session. Tasks continue to completion even after the Claude CLI exits. Scanner heads file new tasks autonomously; Builders implement them; Mergers integrate the results. A visibility CLI (claude-swarm status, swarm-watch TUI) attaches to the daemon regardless of which session started the work.

What the dashboard looks like

The canonical minimal dashboard (scripts/swarm_dashboard.py) modeled on the native Anthropic Teams agent-list interface:

  swarm · 3/5 done · 12s · ↓ 18.4k tokens · $0.0420

  ○ scanner       Scan codebase for type-hint gaps            done          ↓   4.2k   $0.0095
  ○ builder       Refactor utils.py to add type hints         done          ↓   5.1k   $0.0118
  ● test-runner   Write tests for refactored utils            in_progress   ↓   3.8k   $0.0089
  ○ reviewer      Periodic checkpoint review                  done          ↓   2.1k   $0.0048
  ○ merger        Merge cleanly into main                     blocked       ↓      —        —

  ↑/↓ to inspect · Ctrl-C to exit

One header line — aggregate progress, runtime, tokens, cost. One row per role-typed head — status dot (● cyan in_progress, ○ green done / dim idle / magenta blocked, ✗ red failed), name, current task, status word, per-head tokens, per-head spend.

What ships in this PR (5 deliverables):

claude-swarm CLI + library
- DAG-aware kanban (SQLite WAL + BEGIN IMMEDIATE atomic claim), supervisor with parallel dispatch, 12 subcommands
- 94 tests passing, mypy strict + ruff clean, Apache 2.0
- The goal is to merge this library into Anthropic infrastructure https://github.com/kushalj1997/claude-swarm
- Postgres upgrade planned — current SQLite WAL + BEGIN IMMEDIATE is single-host; next step is a Postgres adapter using SELECT FOR UPDATE SKIP LOCKED for multi-host production scale (same Kanban.claim_one() public API; only the backend swaps)
Session-resistant keepalive daemon
- claude-swarm run --daemon detaches via single-fork + setsid; survives Claude Code exit + claude --resume
- Companion daemon-status / daemon-stop for clean lifecycle
6 role-typed agents + 8 slash commands
- Scanner / Builder / Test-Runner / Reviewer / Merger / Auditor with tool-restriction frontmatter
- /swarm-spawn, /swarm-submit, /swarm-status, /swarm-start, /swarm-stop, /swarm-merge, /swarm-abort, /swarm-test
Live demo + minimal TUI dashboard
- try-swarm.sh runs a 5-task DAG in ~25 s via Haiku with auth probe + consent prompt
- Dashboard mirrors the native Anthropic Teams agent-list (one row per head, status + per-head tokens + cost)
Persistent-agent state schema + global-mind transcript
- claude_swarm/agents.py is the on-disk substrate (~/.claude/teams/<team>/agents/<name>.json) the future native Agent(..., persistent=True) would write to
- Every supervisor dispatch is appended to JSONL — replayable for audit, observable in real time

Pertinent — Agent Tool

Two concrete native-Teams limitations this PR confronts head-on:

SendMessage does not surface in spawned teammates. ToolSearch select:SendMessage returns "no matching deferred tools" inside Agent-spawned subprocesses, even with the experimental flag on. Workaround: a filesystem-RPC fallback (claude_swarm/messaging.py) writes directly to ~/.claude/teams/<name>/inboxes/<recipient>.json, which Anthropic's runtime auto-delivers as a conversation turn. Same on-disk schema, drop-in compatible.
The Agent tool itself is not session-resistant. In-binary Agent spawns die when Claude Code exits and claude --resume cannot reattach them. This PR ships a parallel substrate (the keepalive daemon dispatching claude --print subprocesses in a detached process group) so the work survives — but making the same Agent survive requires a binary refactor (proposed in "What this PR ships, and what's next").

Where it started — a Kanban board

The seed of this project was a DAG-aware kanban board with atomic claim_one() task claiming, sitting underneath a small set of role-typed agents. Every other primitive in this PR — supervisor loop, heads architecture, auto-cascade unblock, abort-marker contract, keepalive daemon — grew outward from that one data structure. Tasks declare blockedBy edges, the supervisor polls unblocked() for the topological frontier, parallel heads pull atomically so two workers never claim the same task. Shipped backend is SQLite WAL + BEGIN IMMEDIATE; a Postgres adapter with SELECT FOR UPDATE SKIP LOCKED is the natural next step for multi-host production scale.

Slash commands reference

The plugin ships 8 slash commands at plugins/swarm-orchestrator/commands/*.md — single markdown files with YAML frontmatter declaring description / argument-hint / allowed-tools, plus the prompt body Claude executes.

Command	Purpose
`/swarm-spawn <goal>`	Decompose a goal into a DAG, propose the plan, dispatch a multi-agent team via native `TeamCreate` + `TaskCreate`
`/swarm-submit <prompt>`	Submit a single free-form task to the keepalive swarm; daemon picks it up and dispatches via `claude --print`
`/swarm-status [name]`	Show daemon liveness + DAG topology + head activity + blockers + abort markers + spend
`/swarm-start`	Launch the keepalive supervisor as a detached daemon (single-fork + setsid); survives CLI exit
`/swarm-stop`	Stop the keepalive daemon (SIGTERM → SIGKILL after timeout); removes PID file
`/swarm-merge [name]`	Run the merge pipeline against completed tasks — rebase, test gate, push. Topo-orders by file overlap
`/swarm-abort <head>`	Drop a `<worktree>/.claude/abort-<head>` marker; the head commits WIP, pushes, and exits cleanly
`/swarm-test [name]`	Demo path — spawn a team via native `TeamCreate` + `TaskCreate` and populate the agent-list view with role-typed heads

`claude-swarm` CLI reference

The plugin's daemon + slash commands wrap the claude-swarm Python CLI (installed automatically into .swarm-venv/ by try-swarm.sh).

Command	Purpose	Key flags
`claude-swarm init`	Create the swarm home directory tree	`--home DIR`
`claude-swarm submit`	File a task to the kanban	`--title --prompt --head --blocked-by --tag --priority`
`claude-swarm list`	List tasks, filtered by status / tag	`--status pending\|in_progress\|done\|failed\|...` `--tag`
`claude-swarm status`	JSON snapshot of kanban counts + supervisor state + cost	`--home`
`claude-swarm unblocked`	Print tasks whose blockers are all done (topological frontier)	`--head`
`claude-swarm heads`	List the 6 built-in role-typed heads + their default models	—
`claude-swarm inbox send / recv`	Write a directed message / drain a teammate's inbox	`--from --to --kind --body` / `--name --drain`
`claude-swarm abort {set,clear,check}`	Manage `<worktree>/.claude/abort-<name>` markers	`--worktree --teammate --reason`
`claude-swarm merge`	Auto-merge pipeline (rebase + test gate + push, file-overlap reject)	`--repo --test-cmd --no-overlap-reject`
`claude-swarm run`	The supervisor loop — claims tasks, dispatches heads	`--conductor stub\|claude` `--demo-delay-s` `--max-parallel` `--global-mind-log` `--daemon` `--max-iterations` `--poll-s`
`claude-swarm daemon-status`	Check daemon liveness (PID + log paths)	`--home`
`claude-swarm daemon-stop`	SIGTERM the daemon; SIGKILL after timeout	`--home --timeout-s`

Dependency on the `claude-swarm` library

This plugin requires the standalone claude-swarm Python library (Apache 2.0, 94 tests, mypy strict + ruff clean, ~3 KLOC) for its functional behavior. The plugin's static surfaces (manifest, agent frontmatter, hook scripts, examples, design doc) are self-contained markdown + Python, but the orchestration runtime — kanban with atomic claim, supervisor loop, keepalive daemon, parallel dispatch, global-mind transcript, persistent-agent state schema, dashboard — all live in claude-swarm. The plugin's slash commands shell out to claude-swarm CLI.

The library installs automatically when the operator runs try-swarm.sh (pip install into a sandboxed .swarm-venv/ inside the plugin directory).

Three integration paths Anthropic can pick from:

Accept the dependency as-is — try-swarm.sh continues to pip install from GitHub @ main. Lowest-touch.
Vendor claude_swarm/ into plugins/swarm-orchestrator/lib/ — ~3 KLOC additional inside claude-code, no external dep. Happy to prepare the follow-up PR.
Publish claude-swarm to PyPI under Anthropic's namespace (or accept a transfer) — install becomes pip install claude-swarm with formal versioning.

Try it

One command (no prerequisites beyond the claude CLI on PATH):

bash plugins/swarm-orchestrator/scripts/try-swarm.sh

The script:

Creates .swarm-venv/ inside the plugin directory
pip installs claude-swarm + rich (force-reinstall on every run, so a stale venv from a previous demo never silently uses old code)
Probes auth with a tiny Haiku ping. On failure, prints two clear next-step paths (interactive login via Pro/Max/Team plan, or ANTHROPIC_API_KEY env var) and exits without dispatching anything
Asks for explicit yes consent before any LLM dispatch
Submits a 5-task DAG (Scanner → Builder → {Reviewer, Test-Runner} → Merger), runs the supervisor with --max-parallel 3, launches the dashboard
Auto-exits when all tasks finish (~15–25 s with the default Haiku conductor). Cleanup trap covers EXIT INT TERM — no orphan subprocesses on Ctrl + C

Two non-default modes:

bash plugins/swarm-orchestrator/scripts/try-swarm.sh --stub        # ~20s, no LLM dispatch — free smoke test
bash plugins/swarm-orchestrator/scripts/try-swarm.sh --keepalive   # supervisor runs as a detached daemon

How to review this PR

The diff is partitioned cleanly. Suggested order:

plugins/swarm-orchestrator/plugin.json + .claude-plugin/marketplace.json — surface area
plugins/swarm-orchestrator/agents/*.md (6 files) — role-typed heads with tool allowlists
plugins/swarm-orchestrator/commands/*.md (8 files) — operator entrypoints
plugins/swarm-orchestrator/hooks/{on_task_complete,reviewer_checkpoint}.py — DAG cascade + checkpoint
plugins/swarm-orchestrator/scripts/{try-swarm.sh,swarm_dashboard.py} — demo + dashboard
plugins/swarm-orchestrator/tests/ — 25 / 25 tests (15 hook + 10 scenario), no LLM dispatch required
plugins/swarm-orchestrator/IMPROVEMENTS_OVER_VANILLA_TEAMS.md — design proposal + roadmap

Native Teams limitations addressed (the motivation)

The plugin treats current native Anthropic Teams as the substrate (Agent, TeamCreate, TaskCreate, SendMessage, TaskUpdate, idle/wake) and layers DAG-aware orchestration on top. Every gap below is addressed by shipped code in this PR — nothing aspirational. The full design doc + items deferred to follow-up PRs is in plugins/swarm-orchestrator/IMPROVEMENTS_OVER_VANILLA_TEAMS.md.

#	Limitation	This PR's response
1	`SendMessage` doesn't surface in spawned teammates (`ToolSearch select:SendMessage` returns nothing inside Agent-spawned subprocesses)	Filesystem-RPC fallback (`claude_swarm/messaging.py`) writes directly to `~/.claude/teams/<name>/inboxes/<recipient>.json` — same on-disk schema Anthropic's auto-delivery already reads
2	Native `Agent` spawns die on CLI exit; `claude --resume` can't reattach them	Keepalive supervisor daemon (`--daemon` flag, single-fork + setsid). Workers dispatched as `claude --print` subprocesses in a different process group; the work survives even though the in-binary `Agent` doesn't
3	No DAG iterator — `blockedBy` is settable but no `unblocked()` topological iterator or auto-cascade	`Kanban.unblocked()` topological iterator + `PostToolUse(TaskUpdate)` hook auto-cascades on `status=completed`
4	One generic worker agent type, no role-typed heads	6 role-typed heads (Scanner / Builder / Test-Runner / Reviewer / Merger / Auditor) with tool-restriction frontmatter
5	No abort contract — interrupting an `Agent` loses WIP	`AbortMarker` (`claude_swarm/abort.py`) — drop `<worktree>/.claude/abort-<head>`, the head commits WIP + exits cleanly. Verified end-to-end during PR development.
6	No bounded inbox — a stuck recipient grows the JSON inbox without limit	Bounded `Inbox` (`claude_swarm/messaging.py`) caps at `max_messages` (default 1000); drop-oldest with logged warning
7	No atomic file writes — Anthropic's task / inbox writes risk torn state on crash	tmp-file + `os.replace` (`claude_swarm/messaging.py` `_atomic_write`) — applied to every task / inbox / status write
8	No atomic task claim — two parallel workers could double-claim	`Kanban.claim_one()` uses SQLite `BEGIN IMMEDIATE`; tested under simulated concurrency
9	No conductor timeouts — a stuck LLM call hangs the supervisor indefinitely	600 s default per-dispatch timeout (`claude_swarm/conductor.py`); merge-pipeline test commands at 900 s
10	No worktree garbage collection — branches merged into main leave orphan worktrees	`WorktreeManager.gc_stale()` (`claude_swarm/worktree.py`) — removes worktrees whose branch is merged upstream
11	No replayable record of supervisor decisions — state is ephemeral, lost on exit	`--global-mind-log` flag emits JSONL per supervisor step (turn, task_id, head, status, elapsed_s, cost_so_far_usd)
12	No per-head cost / token accounting in the team-list UI	Dashboard renders per-head `↓ tokens` / `$cost` columns from `spend_by_head` / `tokens_by_head`
13	No filesystem-backed agent registry — `claude --resume` doesn't restore prior agents	`claude_swarm.agents.AgentState` schema + read/write helpers at `~/.claude/teams/<team>/agents/<name>.json` — the substrate a future native `Agent(..., persistent=True)` would write to

Authentication — how the demo gets credentials

The demo dispatches via claude --print and leverages whatever auth the operator's claude binary already uses. It does NOT need its own API key plumbing. The script verifies this with a tiny Haiku probe before doing anything else; failure prints two concrete next-steps and exits cleanly.

The claude binary resolves auth in this priority order:

#	Source	Typical user
1	`ANTHROPIC_API_KEY` env var	Developer / CI / no Max plan
2	`apiKeyHelper` via `--settings`	Org-managed credential rotation
3	OAuth tokens in macOS Keychain (service: `Claude Code-credentials`)	Pro / Max / Team plan after `claude` interactive login

Nothing is stored in the plugin tree. The plugin is keyless.

Plugin surface

DAG-aware kanban — atomic claim_one() for race-free task claiming under N parallel workers. unblocked() topological iterator, add_blocked_by / add_blocks mutation, full status timeline, auto-unblock cascade via PostToolUse(TaskUpdate) hook
8 slash commands + 6 role-typed heads with role-specific tool allowlists
2 lifecycle hooks: PostToolUse(TaskUpdate) for DAG cascade dispatch, Stop for periodic reviewer checkpoints
Abort-marker contract at <worktree>/.claude/abort-<head-name> for graceful interrupt
Testing substrate: 10 binding-agnostic scenarios + 15 hook unit tests, 25 / 25 pass via stdlib unittest (no third-party deps)
Marketplace registration in .claude-plugin/marketplace.json + the plugins/README.md table

Self-healing properties

Designed around the assumption that any individual teammate may die mid-task. Below is the honest split between shipped + deferred.

Shipped in this PR:

Abort-marker contract (claude_swarm/abort.py) — verified end-to-end during PR development
Session-resistant supervisor daemon (--daemon flag) — single-fork + setsid + stdio redirect; survives CLI exit + claude --resume + terminal close
Bounded inbox queues (claude_swarm/messaging.py) — drop-oldest with logged warning; no runaway memory
Atomic file writes — tmp + os.replace everywhere
Worktree garbage collection — removes branches merged into upstream; safe by default
Conductor timeouts — 600 s LLM dispatch, 900 s merge-pipeline tests
Filesystem-RPC fallback for SendMessage — works when the native tool doesn't surface
Atomic task claim — SQLite BEGIN IMMEDIATE prevents double-claim under concurrency

Designed but deferred to a follow-up PR:

Multi-host meta-supervisor with auto-respawn for crashed teammates
Stuck-task watchdog (in_progress > 30 min → re-dispatch) — depends on multi-host coordinator
Pattern-detection classifier — log every supervisor decision; offline-train on task_description → parallelism_safety. Logging is shipped (global-mind transcript); classifier is future work
Multi-team-per-session — requires TeamCreate schema extension on Anthropic's side

Session-resistance — the keepalive daemon

claude-swarm run --daemon detaches the supervisor: single-fork + setsid + stdio redirect to log file. Parent prints PID + log path then exits; child keeps polling the kanban and dispatching workers regardless of what happens to the shell, the Claude Code session, or the terminal.

# Start the daemon
claude-swarm run --home ~/.claude/swarm --daemon --conductor claude \
    --global-mind-log ~/.claude/swarm/global-mind.jsonl

# Submit a task at any time
claude-swarm submit --home ~/.claude/swarm --head builder \
    --title "audit imports" --prompt "Audit ./src for unused imports."

# Inspect daemon liveness
claude-swarm daemon-status --home ~/.claude/swarm    # exit 0 alive, 1 dead

# Stop cleanly
claude-swarm daemon-stop --home ~/.claude/swarm      # SIGTERM, SIGKILL after 5s

Bridging native Teams agents to the keepalive daemon. A native Agent (spawned by the binary's Agent tool) can register long-running work with the daemon and exit cleanly — without doing the work itself:

# Inside the native agent's prompt (Bash):
task_id = subprocess.check_output([
    "claude-swarm", "submit",
    "--home", "~/.claude/swarm",
    "--head", "builder",
    "--prompt", "<long-running prompt>",
]).decode().strip()
# Agent exits; daemon picks up the task and runs it via `claude --print`.

The native agent is the front-end (interactive, in your CLI, dies with the binary); the daemon-dispatched worker is the back-end (session-resistant, runs detached). They share the kanban + inbox as the contract.

Operator-facing test plan — session-resistance proof

# 1. Open a fresh Claude Code session in any project
claude
# 2. In the session, start the keepalive daemon
/swarm-start
# 3. Submit a small task (sleeps 30s, writes a proof file)
/swarm-submit "sleep 30 && echo 'still alive after $(date)' > /tmp/swarm-keepalive-proof.txt" --head builder
# 4. EXIT Claude Code
/exit
# 5. Run claude --resume to restore the session
claude --resume
# 6. Verify daemon survived
/swarm-status        # should still show alive with same PID
# 7. Confirm the proof file was written while you were out
cat /tmp/swarm-keepalive-proof.txt
# 8. Stop the daemon
/swarm-stop

Global mind — the swarm's collective transcript

Every supervisor dispatch is appended to <home>/global-mind.jsonl (controlled via claude-swarm run --global-mind-log <path>). One line per event:

{"ts": 1778447590.33, "turn": 1, "task_id": "0019e1...", "head": "scanner", "title": "Scan codebase", "status": "done", "elapsed_s": 0.024, "cost_so_far_usd": 0.0}
{"ts": 1778447590.39, "turn": 2, "task_id": "0019e1...", "head": "builder", "title": "Refactor utils.py", "status": "done", "elapsed_s": 0.060, "cost_so_far_usd": 0.0}

The transcript is replayable (cat global-mind.jsonl | jq .), streamable (tail -f is a live feed any dashboard can subscribe to), and observable (cost accounting, latency tracking, per-head spend rollups all derive from it).

Architecture stack

Layer	Choice	Why
CLI framework	`click`	Same as `pip`, `flask`, `mkdocs` — familiar conventions
TUI / live dashboard	`rich`	`Live` renderer + `Text` styling
Web / RPC (optional persistent mode)	`FastAPI`	Python-native, no extra runtime; for downstream multi-host meta-supervisors with WebSocket pub/sub
Concurrency	stdlib `asyncio` + `subprocess`	Supervisor is single-writer by design
Task storage (library)	`sqlite3` with WAL + `BEGIN IMMEDIATE`	Atomic claim without external services. Postgres adapter (`SELECT FOR UPDATE SKIP LOCKED`) is the planned next step.
Type checking / lint	`mypy --strict` + `ruff`	Clean across library + plugin
Tests	`pytest` (library, 94 tests) + stdlib `unittest` (plugin, 25 tests)	Both green

No deep-learning frameworks, no databases beyond optional SQLite/Postgres, no message brokers in the default install.

What this PR ships, and what's next

The 5 deliverables are summarized at the top of this PR. Here's the honest follow-up roadmap:

Designed but deferred to a follow-up PR (not shipped here):

Multi-host meta-supervisor — multiple keepalive daemons across hosts coordinating via a shared task graph; cross-machine SendMessage. Single-host keepalive shipped here is the foundation
Dead-teammate respawn — depends on the multi-supervisor coordinator
Stuck-task watchdog — same dependency; status timeline already records claimed_at
Pattern-detection classifier — log every supervisor decision; offline-train on task_description → parallelism_safety. Logging is shipped; classifier is future work
The next concrete native-Teams refactor — a small Agent(..., persistent=True) flag + filesystem-backed reattachment on claude --resume. The on-disk schema this PR's daemon already uses (~/.claude/teams/<team>/agents/<name>.json) is the schema the future native flag would write to — zero migration when the binary catches up.

Known limitation — read first

This PR ships session-resistance for the WORK (via the keepalive daemon + filesystem kanban + persistent-agent state schema). It does NOT ship session-resistance for the native Agent tool's team-list UI — that surface requires changes inside the claude-code binary itself.

You exit Claude Code, run claude --resume → daemon still alive, dispatched workers still running, kanban state intact, global-mind transcript continues appending. ✅
You exit Claude Code, run claude --resume → native in-binary Agent spawns are gone from the team-list UI. ❌ Architectural limit of the current Agent tool — the proposed binary-side fix is in "What this PR ships, and what's next".

Deferred polish — to land after review

Two demo-only conveniences this PR keeps project-local rather than canonical, to keep the review diff scoped:

Demo venv location. try-swarm.sh creates .swarm-venv/ inside plugins/swarm-orchestrator/. Once production-ready, it should move to ~/.claude/claude-swarm/venv/ (matching the keepalive daemon's home at ~/.claude/swarm/). Keeping it local during review keeps install + cleanup symmetric with the rest of the plugin tree.
Demo model selection. The CLI's claude-swarm run --conductor claude forces claude-haiku-4-5 for every head so the demo finishes in <30 s at minimal token spend. Production callers wire model_override=None (Python API) or CLAUDE_SWARM_MODEL_OVERRIDE="" (env var) to use each head's role-default.

Breaking changes

None. No changes to TaskCreate / TaskUpdate / SendMessage / Team* schemas. No changes to existing plugins. No changes to vanilla Teams behavior. New hooks only fire when the plugin is installed AND the session matches a swarm context.

Test plan

Library (github.com/kushalj1997/claude-swarm):

94 / 94 tests pass (kanban, supervisor, conductor, abort marker, worktree manager, merge pipeline, messaging, reviewer checkpoint, agents)
mypy --strict clean + ruff check clean

Plugin (plugins/swarm-orchestrator/):

15 / 15 hook unit tests pass — python3 -m unittest tests.test_hooks -v
10 / 10 scenario tests pass — python3 tests/swarming/run_scenario.py --all
Plugin manifest validates against the marketplace schema
Every command + agent file has well-formed YAML frontmatter
Hooks exit 0 on malformed stdin (no-op for non-TaskUpdate / non-Builder events)

Demo verification (operator-facing try-swarm.sh path):

--stub mode completes in ~20 s, no LLM dispatch, no auth prompt
Default mode presents preflight banner + explicit consent prompt; refusing exits cleanly without dispatch
Auto-exits when kanban shows done ≥ 1 and pending = in_progress = 0

Maintainer review:

claude plugin validate plugins/swarm-orchestrator/
CI workflows pass on the PR branch

License

Open to your guidance. This plugin directory does not currently include a LICENSE file — matching the convention I observed across the existing 14 plugins in plugins/ (none ship per-plugin licenses; the repo's top-level LICENSE.md applies). The companion standalone library at https://github.com/kushalj1997/claude-swarm ships under Apache 2.0 explicitly.

CLA

Will sign Anthropic's standard CLA on submission if required.

References

Plugin structure follows plugins/feature-dev/ (commands + agents) and plugins/hookify/ (hooks layout)
DAG dispatch primitive is built on the existing blockedBy field of TaskCreate (no schema change)
Reviewer-checkpoint hook conceptually similar to plugins/learning-output-style/'s SessionStart injection but fires on Stop against turn-count config

A new plugin that layers DAG-aware multi-agent coordination on top of native Anthropic Teams — additive only, no schema changes to TaskCreate / TaskUpdate / SendMessage / Team*. Plugin surface (purely additive): - 8 slash commands: /swarm-spawn, /swarm-submit, /swarm-status, /swarm-start, /swarm-stop, /swarm-merge, /swarm-abort, /swarm-test - 6 role-typed subagents with tool-restriction frontmatter: Scanner, Builder, Test-Runner, Reviewer, Merger, Auditor - 2 lifecycle hooks: PostToolUse(TaskUpdate) for DAG cascade dispatch, Stop for periodic reviewer checkpoints - 3 worked examples: refactor, feature+review, multi-day audit - Marketplace registration in .claude-plugin/marketplace.json plus a row in plugins/README.md Designed to operate in two modes: standalone (lightweight kanban + JSON inbox routing, no Anthropic Teams required) or integrated with native Teams (where SendMessage / TaskCreate / TeamCreate already exist and the plugin layers DAG iterators + role-typed heads on top).

Testing substrate at plugins/swarm-orchestrator/tests/swarming/: - 10 self-contained, deterministic, sub-5-minute toy scenarios exercising every primitive: multi-file-rename, spec-impl-pair, scan-build-review, doc-writer-team, multi-language-port, audit-then-fix, conflict-resolution-drill, abort-marker-test, respawn-on-crash, multi-team-coordination - Canonical scenario JSON schema at schema/scenario.schema.json - Binding-agnostic ScenarioEngine protocol + reference InProcessScenarioEngine - 15 hook unit tests for the cascade + checkpoint hooks - All tests pass: 10/10 scenarios + 15/15 hook tests, no LLM dispatch (every scenario test runs against a deterministic in-process reference engine, so the test suite costs zero tokens) Design proposal at IMPROVEMENTS_OVER_VANILLA_TEAMS.md: - 45 documented primitives across core / reliability / observability / coordination / quality / docs / advanced layers - Multi-tier architecture: supervisor per team, meta-supervisor per host, head-typed agents - Explicit shipped-vs-deferred split (PR description's tables remain authoritative; design doc sketches the full target surface) - Comparison table vs vanilla Teams

…oard scripts/try-swarm.sh — canonical 'try it' entrypoint: - Three modes: --stub (free smoke test, ~20s), default real-LLM dispatch (~15-25s via Haiku with one-word prompts; explicit consent prompt), and --keepalive (supervisor runs as a detached daemon that survives CLI exit + 'claude --resume') - Preflight: checks the 'claude' CLI is installed + probes auth with a tiny Haiku ping. Failure prints two clear next-step paths (interactive login via Pro/Max/Team plan, or ANTHROPIC_API_KEY env var) and exits cleanly without dispatching anything - Force-reinstalls claude-swarm on every run so stale venv state from a previous demo can't silently use older code missing the latest flags - Parallel dispatch (3 tasks concurrent) so multiple in-progress rows render simultaneously — the 'live' demo feel - DAG ordering: scanner → builder → {reviewer, test-runner} → merger, so reviewer reviews the build's output (not running in parallel from t=0); test-runner runs in parallel with reviewer after builder - Cleanup trap covers EXIT, SIGINT (Ctrl+C), SIGTERM — pkill -P on the supervisor's children, 2s grace, then SIGKILL escalation. No orphans. - At exit, points to supervisor.log, global-mind.jsonl (every dispatch + cost increment as JSONL — the swarm's collective transcript), and the kanban cascade-events.jsonl scripts/swarm_dashboard.py — minimal TUI: - Modeled on the native Anthropic Teams agent-list interface - One header line: progress, runtime, tokens, aggregate cost - One row per role-typed head: status dot (●/○/✗), name, current task, status word, per-head tokens, per-head spend - Reads from claude-swarm list (plain text + optional --json), so it works against any installed version of the library The 'global-mind' framing: the JSONL transcript is the swarm's collective state-of-the-world, replayable for audit and observable in real time.

kushalj1997 force-pushed the feature/swarm-orchestrator-plugin branch from 6c81c49 to 969cb94 Compare May 10, 2026 20:19

kushalj1997 force-pushed the feature/swarm-orchestrator-plugin branch from 9f516c4 to bc75369 Compare May 10, 2026 21:26

kushalj1997 marked this pull request as ready for review May 10, 2026 22:10

kushalj1997 force-pushed the feature/swarm-orchestrator-plugin branch 2 times, most recently from 2efddd3 to 959482b Compare May 10, 2026 22:40

kushalj1997 force-pushed the feature/swarm-orchestrator-plugin branch from 959482b to 389f885 Compare May 10, 2026 22:50

kushalj1997 added 3 commits May 10, 2026 15:51

kushalj1997 force-pushed the feature/swarm-orchestrator-plugin branch from 389f885 to f855f82 Compare May 10, 2026 22:51

This was referenced May 11, 2026

📊 AI CLI 工具社区动态日报 2026-05-11 zx0828/big_model_radar#49

Open

📊 AI CLI 工具社区动态日报 2026-05-11 gsscsd/big_model_radar#325

Open

📊 AI CLI 工具社区动态日报 2026-05-11 ivanweng2077/big_model_radar#25

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Autonomous Claude Swarms -- Teams Improvement => feat(plugins): swarm-orchestrator — DAG-aware multi-tier coordination with role-typed heads for autonomous agent teams#57880

Autonomous Claude Swarms -- Teams Improvement => feat(plugins): swarm-orchestrator — DAG-aware multi-tier coordination with role-typed heads for autonomous agent teams#57880
kushalj1997 wants to merge 3 commits into
anthropics:mainfrom
kushalj1997:feature/swarm-orchestrator-plugin

kushalj1997 commented May 10, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

kushalj1997 commented May 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Autonomous Claude Swarms and Agent Teams Improvement

Summary

What the dashboard looks like

Pertinent — Agent Tool

Where it started — a Kanban board

Slash commands reference

claude-swarm CLI reference

Dependency on the claude-swarm library

Try it

How to review this PR

Native Teams limitations addressed (the motivation)

Authentication — how the demo gets credentials

Plugin surface

Self-healing properties

Session-resistance — the keepalive daemon

Operator-facing test plan — session-resistance proof

Global mind — the swarm's collective transcript

Architecture stack

What this PR ships, and what's next

Known limitation — read first

Deferred polish — to land after review

Breaking changes

Test plan

License

CLA

References

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kushalj1997 commented May 10, 2026 •

edited

Loading

`claude-swarm` CLI reference

Dependency on the `claude-swarm` library