
Multi-Fleet

Cross-machine AI collaboration for Claude Code, Cursor, VS Code, Codex, and Gemini.

Real-time peer-to-peer messaging with a 9-priority self-healing fallback chain, session-aware autonomous task agents, HMAC-signed communication, and fleet-wide productivity visibility. Messages always deliver — even when NATS is down, HTTP is blocked, and SSH is your only path.

License: MIT · Python 3.10+ · NATS · Claude Code · ZSF · PRs welcome

Quick Start · Architecture · Demo · Install · Docs


The Vision: Orchestrated Coding at Any Scale

A single developer with a coordinated fleet of AI agents will out-ship a team of ten with isolated IDEs.

Multi-Fleet is built on one belief: the bottleneck in software is no longer typing — it's coordination. When you have one AI assistant in one IDE, you're driving a car. When you have a hundred AI assistants running on a hundred machines, all talking to each other, all aware of what the others just shipped — you're not driving anymore. You're conducting an orchestra.

| Scale | What it unlocks |
| --- | --- |
| 1 machine | Persistent context across sessions; never re-explain your codebase |
| 2 machines | Background agent on machine #2 reviews every PR you push from #1 — instant second opinion |
| 3–5 machines | A team of AI agents you own. One races to fix the bug, one writes the test, one updates the docs. Best result wins. |
| 10+ machines | A swarm. Refactor your entire monorepo overnight. Each agent owns a directory. Failures auto-redistribute. |
| 100+ machines | A coding datacenter. Continuous fleet-wide refactor. AI agents propose changes 24/7. You wake up to a stack of evidence-backed PRs. |

The protocol scales linearly. The architecture scales horizontally. The only ceiling is your imagination.

Why Multi-Fleet?

You have three Macs. Two of them are working on your codebase right now. The third is sleeping. One has the database. One has the GPU. One has your IDE open.

Without Multi-Fleet: You manually ssh into each machine, copy-paste commands, lose context, forget which session knows what. When one machine drops off Wi-Fi, your workflow stops.

With Multi-Fleet: Your AI assistant on mac1 sends a task to mac2, mac2 picks it up in its own Claude Code session, mac3 wakes from sleep to run the GPU job. If NATS goes down mid-conversation, the message reroutes through HTTP. If HTTP fails, it falls through SSH. The message gets there.

   "@mac3 train the model overnight"
            │
            ▼
   ┌──────────────────────────────────────────────────┐
   │  mac1 (you)  ◀── 9-priority cascade ──▶  mac3   │
   │                                                  │
   │  P0  Discord/Cloud                               │
   │  P1  NATS pub/sub (clustered, primary)           │
   │  P2  HTTP direct (both daemons up)               │
   │  P3  Chief relay (one peer reachable)            │
   │  P4  Seed file (SSH-write to inbox)              │
   │  P5  SSH direct (keys configured)                │
   │  P6  Wake-on-LAN (target asleep)                 │
   │  P7  Git push (last resort, always works)        │
   │  P8  Local IPC (Superset terminal-host.sock)     │
   └──────────────────────────────────────────────────┘
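The cascade above can be sketched as an ordered-fallback loop. This is an illustrative sketch only — the `Channel` class, `cascade_send` function, and channel callables here are hypothetical, not the actual multifleet API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Channel:
    name: str
    priority: int
    send: Callable[[str, dict], bool]  # returns True on confirmed delivery

def cascade_send(channels: list, to: str, payload: dict) -> str:
    """Try each channel in priority order; first confirmed delivery wins."""
    failures = []
    for ch in sorted(channels, key=lambda c: c.priority):
        try:
            if ch.send(to, payload):
                return ch.name
        except Exception as exc:
            failures.append((ch.name, repr(exc)))  # ZSF: every failure is recorded
    raise RuntimeError(f"all channels failed: {failures}")

def nats_send(to, payload):
    raise ConnectionError("NATS cluster unreachable")  # simulate an outage

channels = [
    Channel("nats", 1, nats_send),
    Channel("http", 2, lambda to, payload: True),  # HTTP daemon is up
]
print(cascade_send(channels, "mac3", {"subject": "train overnight"}))  # http
```

With NATS down, delivery falls through to HTTP and the NATS failure is captured rather than swallowed — the same shape as the P0–P8 chain, just with two channels.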

Features

| Feature | Description |
| --- | --- |
| 🛰️ 9-priority cascade | Discord → NATS → HTTP → relay → seed → SSH → WoL → git → IPC. First channel that works, wins. |
| 🔄 Self-healing | Channels that come back online are automatically re-prioritized. Broken paths trigger repair via working ones. |
| 🤖 Session-aware agents | Tasks remember which VS Code window / Claude Code session they came from. Reply routing is automatic. |
| 🔐 HMAC-signed | Every packet is signed. Replay-protected. Optional E2E encryption via age keys. |
| 🩺 Health observable | /health endpoint, per-channel counters, JetStream stream stats, per-peer last-seen. |
| 🧬 LLM-native | Built for Claude Code, Cursor, Codex, Gemini, VS Code. Plugin packages included. |
| 📊 Fleet visibility | Productivity races, leaderboards, evidence streams. See what every node is doing in real time. |
| 🛡️ Zero Silent Failures | Every error path bumps a named counter. No except: pass. Ever. |
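The HMAC-signing feature can be sketched with the standard library. The shared secret, packet fields, and `nonce`/`ts` replay-guard fields below are illustrative assumptions, not the project's actual wire format:

```python
import hashlib
import hmac
import json

SECRET = b"fleet-shared-secret"  # illustrative; a real node would load this from config

def sign(packet: dict) -> dict:
    """Attach an HMAC-SHA256 signature over the canonical JSON body."""
    body = json.dumps(packet, sort_keys=True).encode()
    signed = dict(packet)
    signed["sig"] = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return signed

def verify(signed: dict) -> bool:
    """Recompute the signature over everything except 'sig'; compare in constant time."""
    body = {k: v for k, v in signed.items() if k != "sig"}
    expected = hmac.new(SECRET, json.dumps(body, sort_keys=True).encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("sig", ""))

packet = sign({"type": "context", "to": "mac2", "nonce": "a1b2", "ts": 1715633642})
assert verify(packet)
packet["to"] = "mac3"  # any tampering breaks verification
assert not verify(packet)
```

Including a nonce and timestamp in the signed body is what makes replay protection possible: a receiver can reject packets whose nonce it has already seen or whose timestamp is stale.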

Quick Start

Install

# pip install (single node)
pip install multi-fleet

# Or clone for development
git clone https://github.com/supportersimulator/multi-fleet
cd multi-fleet && pip install -e .

Run a node

export MULTIFLEET_NODE_ID=mac1
export NATS_URL=nats://127.0.0.1:4222
python3 -m multifleet.daemon serve

Send a message

curl -X POST http://127.0.0.1:8855/message \
  -H "Content-Type: application/json" \
  -d '{
    "type":    "context",
    "to":      "mac2",
    "payload": {"subject":"deploy","body":"push the landing page"}
  }'

Watch fleet health

curl -s http://127.0.0.1:8855/health | jq
bash scripts/fleet-check.sh    # full dashboard
bash scripts/fleet-summary.sh  # one-line status bar

That's it. Two nodes with MULTIFLEET_NODE_ID set to different values, both connected to the same NATS (or both reachable via any of P2–P7), and you have a fleet.
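If you would rather script the send than shell out to curl, the same POST can be made from Python with only the standard library. The endpoint and payload shape follow the Quick Start above; the response body is whatever the daemon returns (assumed here to be JSON):

```python
import json
import urllib.request

def send_message(base_url: str, to: str, subject: str, body: str) -> bytes:
    """POST a context message to a node's /message endpoint."""
    msg = {
        "type": "context",
        "to": to,
        "payload": {"subject": subject, "body": body},
    }
    req = urllib.request.Request(
        f"{base_url}/message",
        data=json.dumps(msg).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read()

# Usage (assumes a daemon is listening on the Quick Start port):
# send_message("http://127.0.0.1:8855", "mac2", "deploy", "push the landing page")
```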


Architecture

    ┌─────────────────────────────────────────────────────────────┐
    │                      Your Machine (mac1)                    │
    │  ┌────────────┐    ┌──────────────┐    ┌─────────────────┐  │
    │  │ Claude Code├───▶│  HTTP :8855  │───▶│ ChannelProtocol │  │
    │  │ /Cursor/etc│    │   /message   │    │  (9 channels)   │  │
    │  └────────────┘    └──────────────┘    └────────┬────────┘  │
    │                                                  │           │
    │  ┌──────────────────────────────────────────────▼────────┐  │
    │  │   NATS  │  HTTP  │  SSH  │  Git  │  WoL  │  IPC  │ ...│  │
    │  └────────┬─────────────────────────────────────────────┘  │
    └───────────┼──────────────────────────────────────────────────┘
                │
                ▼  (whichever channel is healthy)
    ┌─────────────────────────────────────────────────────────────┐
    │                      Peer Machine (mac2)                    │
    │                                                             │
    │   Inbound packet  ▶  inbox  ▶  hook  ▶  Claude Code session │
    └─────────────────────────────────────────────────────────────┘

See ARCHITECTURE.md for the full deep dive: JetStream replication, MFINV invariants C01–C07, fleet-state KV, plist drift detection, and the self-heal loop.


The Three Invariants

These cannot be relaxed:

  1. ZSF — Zero Silent Failures. Every exception path bumps an observable counter. except Exception: pass is forbidden.
  2. No Single Point of Failure. No channel, daemon, or peer is required. The cascade always has a fallback.
  3. No Polling. Event-driven everywhere. Polling loops fail the test_no_polling_invariant.py gate.
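The ZSF rule (invariant 1) reduces to one pattern: every exception path increments an observable counter before anything else happens. A minimal sketch, with a hypothetical `guarded` helper and illustrative counter names:

```python
from collections import Counter

counters = Counter()  # in a real daemon these would back the /health endpoint

def guarded(counter_name: str, fn, *args, **kwargs):
    """Run fn; on failure, bump a named counter and re-raise. Nothing is silent."""
    try:
        return fn(*args, **kwargs)
    except Exception:
        counters[counter_name] += 1
        raise

def flaky_publish():
    raise ConnectionError("NATS unreachable")

try:
    guarded("nats_publish_errors_total", flaky_publish)
except ConnectionError:
    pass  # the caller may recover (e.g. cascade to HTTP), but the failure was counted

print(counters["nats_publish_errors_total"])  # 1
```

The caller is still free to fall through to the next channel — the point is that the failure left a trace before control moved on.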

Fleet-State KV

Multi-Fleet ships with a JetStream-backed KV store (fleet_roster) that every node reads and writes. It tracks who's alive, who's chief, who has which capability, and which channels are working between which peers.

nats kv get fleet_roster mac2 --raw | jq
{
  "node_id":        "mac2",
  "last_seen":      "2026-05-13T21:14:02Z",
  "capabilities":   ["nats","http","ssh","git"],
  "peers_seen":     ["mac1","mac3"],
  "git_head":       "865f4a8...",
  "claude_session": "fab27887..."
}

This is what makes the cascade smart: each node knows which channels work to which peer right now, and picks the cheapest healthy one first.
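"Pick the cheapest healthy one" can be sketched as a set intersection over roster data. The roster shape mirrors the `fleet_roster` entry above; the cost weights and `pick_channel` function are illustrative, not the project's actual selection logic:

```python
CHANNEL_COST = {"nats": 1, "http": 2, "ssh": 5, "git": 10}  # lower = cheaper

def pick_channel(roster_entry: dict, locally_healthy: set) -> str:
    """Intersect the peer's advertised capabilities with our healthy channels,
    then take the cheapest of the overlap."""
    candidates = set(roster_entry["capabilities"]) & locally_healthy
    if not candidates:
        raise RuntimeError("no common healthy channel; escalate to WoL / git seed")
    return min(candidates, key=lambda c: CHANNEL_COST.get(c, 99))

mac2 = {"node_id": "mac2", "capabilities": ["nats", "http", "ssh", "git"]}
print(pick_channel(mac2, {"http", "ssh", "git"}))  # NATS is down locally -> http
```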


Use Cases

  • Distributed AI development — Run Claude Code on 3 machines, route subtasks to whichever has the right context/GPU/database
  • Always-available context — Your IDE on the laptop tells the desktop "remember this for tomorrow"; the desktop persists it even if the laptop closes
  • Auto-recovering pipelines — CI/CD steps that don't fail when one box loses Wi-Fi for 90 seconds
  • Pair-programming with your own fleet — A second AI on a second machine reviews PRs while you keep coding
  • Race coordination — Multiple AI agents compete on the same task; first to finish wins, others learn from the winner
  • Swarm refactor — Carve up a monorepo across N machines, each agent claims a directory, wake up to a stack of tested PRs
  • Continuous background review — Every commit you push triggers a fleet-wide validation pass on every other node
  • Geographically distributed teams — One cluster across home office, cabin laptop, cloud node, and a collaborator's machine on another continent
  • Solo-founder force multiplier — One person, N machines. Don't hire — provision.

Integrations

| Tool | Plugin | Status |
| --- | --- | --- |
| Claude Code | plugin (this repo) | ✅ Stable |
| Cursor | MCP server (tools/multifleet_mcp.py) | ✅ Stable |
| VS Code | extension host bridge | ✅ Stable |
| Codex | codex-config.toml.example | ✅ Stable |
| Gemini | gemini-extension.json | ✅ Stable |
| Superset terminal-host | P8 IPC | 🧪 Experimental |

Zero Silent Failures

Every failure path in Multi-Fleet bumps a named counter. You can grep the codebase: there is no except: pass. The CI gate enforces this.

curl -s http://127.0.0.1:8855/health | jq '.counters'
{
  "nats_publish_errors_total":          0,
  "http_send_errors_total":             2,
  "ssh_seed_write_errors_total":        0,
  "channel_cascade_fallback_total":     14,
  "self_heal_repair_attempts_total":    3,
  "self_heal_repair_success_total":     3,
  "webhook_offsite_nats_publish_total": 1024
}

If an error counter is climbing, you have a real bug. Fallback and repair counters ticking up are normal — that's the cascade doing its job. Zeros across the error counters mean your fleet is healthy.
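A simple way to watch for climbing error counters is to diff two `/health` snapshots. The `climbing_errors` helper below is a sketch; the counter names follow the example output above:

```python
def climbing_errors(before: dict, after: dict) -> dict:
    """Return the error counters that increased between two snapshots."""
    return {name: after[name] - before.get(name, 0)
            for name in after
            if name.endswith("_errors_total") and after[name] > before.get(name, 0)}

before = {"http_send_errors_total": 2, "nats_publish_errors_total": 0}
after  = {"http_send_errors_total": 5, "nats_publish_errors_total": 0}
print(climbing_errors(before, after))  # {'http_send_errors_total': 3}
```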


Roadmap

  • 9-priority channel cascade
  • JetStream replication (R=3 across cluster)
  • Fleet-state KV with auto-sync
  • Self-heal loop with cardinal-direction repair
  • HMAC signing
  • Leaf-node mode for off-network reliability
  • Discord P0 channel for emergency relay
  • WireGuard auto-mesh for zero-config private networking
  • Native Windows daemon (currently macOS/Linux first-class)
  • Browser extension for in-tab fleet visibility

Documentation


Status

Multi-Fleet powers a working 4-node fleet (mac1, mac2, mac3, cloud) running 24/7 across home network, cellular, and AWS. It has shipped through real network partitions, sleep/wake cycles, NATS server restarts, and intermittent Wi-Fi without dropping a message.

It is production-tested at small scale. We invite you to test it at yours.


License

MIT — do anything you want, just keep the copyright notice. See LICENSE.


Built by humans who got tired of ssh user@mac2 'pkill -9 daemon'.

⭐ Star this repo if you've ever had three machines and wished they talked to each other.
