Cross-machine AI collaboration for Claude Code, Cursor, VS Code, Codex, and Gemini.
Real-time peer-to-peer messaging with a 9-priority self-healing fallback chain, session-aware autonomous task agents, HMAC-signed communication, and fleet-wide productivity visibility. Messages always deliver — even when NATS is down, HTTP is blocked, and SSH is your only path.
Quick Start · Architecture · Demo · Install · Docs
A single developer with a coordinated fleet of AI agents will out-ship a team of ten with isolated IDEs.
Multi-Fleet is built on one belief: the bottleneck in software is no longer typing — it's coordination. When you have one AI assistant in one IDE, you're driving a car. When you have a hundred AI assistants running on a hundred machines, all talking to each other, all aware of what the others just shipped — you're not driving anymore. You're conducting an orchestra.
| Scale | What it unlocks |
|---|---|
| 1 machine | Persistent context across sessions, never re-explain your codebase |
| 2 machines | Background agent on machine #2 reviews every PR you push from #1 — instant second opinion |
| 3–5 machines | A team of AI agents you own. One races to fix the bug, one writes the test, one updates the docs. Best result wins. |
| 10+ machines | A swarm. Refactor your entire monorepo overnight. Each agent owns a directory. Failures auto-redistribute. |
| 100+ machines | A coding datacenter. Continuous fleet-wide refactor. AI agents propose changes 24/7. You wake up to a stack of evidence-backed PRs. |
The protocol scales linearly. The architecture scales horizontally. The only ceiling is your imagination.
You have three Macs. Two of them are working on your codebase right now. The third is sleeping. One has the database. One has the GPU. One has your IDE open.
Without Multi-Fleet: You manually ssh into each machine, copy-paste commands, lose context, forget which session knows what. When one machine drops off Wi-Fi, your workflow stops.
With Multi-Fleet: Your AI assistant on mac1 sends a task to mac2, mac2 picks it up in its own Claude Code session, mac3 wakes from sleep to run the GPU job. If NATS goes down mid-conversation, the message reroutes through HTTP. If HTTP fails, it falls through SSH. The message gets there.
"@mac3 train the model overnight"
│
▼
┌──────────────────────────────────────────────────┐
│ mac1 (you) ◀── 9-priority cascade ──▶ mac3 │
│ │
│ P0 Discord/Cloud │
│ P1 NATS pub/sub (clustered, primary) │
│ P2 HTTP direct (both daemons up) │
│ P3 Chief relay (one peer reachable) │
│ P4 Seed file (SSH-write to inbox) │
│ P5 SSH direct (keys configured) │
│ P6 Wake-on-LAN (target asleep) │
│ P7 Git push (last resort, always works) │
│ P8 Local IPC (Superset terminal-host.sock) │
└──────────────────────────────────────────────────┘
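The cascade logic itself is simple: walk the priorities in order and stop at the first channel that accepts the packet. A minimal Python sketch, with hypothetical names (the real implementation lives in the daemon's `ChannelProtocol`), that also honors the no-silent-failures rule by counting every error:

```python
from collections import Counter
from typing import Callable

counters = Counter()  # every failure path bumps a named counter

def send_with_cascade(packet: dict, channels: list[tuple[str, Callable[[dict], bool]]]) -> str:
    """Walk channels in priority order; first successful delivery wins."""
    for name, send in channels:
        try:
            if send(packet):
                return name
        except Exception:
            counters[f"{name}_send_errors_total"] += 1   # counted, never silent
        counters["channel_cascade_fallback_total"] += 1  # falling through to next priority
    raise RuntimeError("all channels exhausted")

def nats_send(packet: dict) -> bool:
    raise ConnectionError("NATS unreachable")            # simulated outage

def http_send(packet: dict) -> bool:
    return True                                          # delivery succeeds

winner = send_with_cascade({"to": "mac3"}, [("nats", nats_send), ("http", http_send)])
assert winner == "http"                                  # HTTP won after NATS failed
```

The shape is the whole idea: channels are interchangeable senders, and the caller never cares which one actually carried the packet.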
| 🛰️ 9-priority cascade | Discord → NATS → HTTP → relay → seed → SSH → WoL → git → IPC. First channel that works, wins. |
| 🔄 Self-healing | Channels that come back online are automatically re-prioritized. Broken paths trigger repair via working ones. |
| 🤖 Session-aware agents | Tasks remember which VS Code window / Claude Code session they came from. Reply routing is automatic. |
| 🔐 HMAC-signed | Every packet is signed. Replay-protected. Optional E2E encryption via age-keys. |
| 🩺 Health observable | /health endpoint, per-channel counters, JetStream stream stats, per-peer last-seen. |
| 🧬 LLM-native | Built for Claude Code, Cursor, Codex, Gemini, VS Code. Plugin packages included. |
| 📊 Fleet visibility | Productivity races, leaderboards, evidence streams. See what every node is doing in real time. |
| 🛡️ Zero Silent Failures | Every error path bumps a named counter. No except: pass. Ever. |
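HMAC signing with timestamp-based replay protection can be sketched with the standard library alone. The field names and key handling below are illustrative, not Multi-Fleet's actual wire format (see COMMS-PROTOCOL.md for that):

```python
import hashlib
import hmac
import json
import time

SHARED_KEY = b"fleet-shared-secret"  # illustrative; a real deployment loads this from config

def sign_packet(packet: dict, key: bytes = SHARED_KEY) -> dict:
    body = dict(packet, ts=time.time())  # timestamp enables replay rejection
    digest = hmac.new(key, json.dumps(body, sort_keys=True).encode(), hashlib.sha256).hexdigest()
    return dict(body, sig=digest)

def verify_packet(signed: dict, key: bytes = SHARED_KEY, max_age: float = 30.0) -> bool:
    body = {k: v for k, v in signed.items() if k != "sig"}
    expected = hmac.new(key, json.dumps(body, sort_keys=True).encode(), hashlib.sha256).hexdigest()
    fresh = time.time() - signed["ts"] < max_age          # reject stale replays
    return hmac.compare_digest(expected, signed["sig"]) and fresh

pkt = sign_packet({"type": "context", "to": "mac2"})
assert verify_packet(pkt)        # intact packet verifies
pkt["to"] = "mac3"
assert not verify_packet(pkt)    # any tampering breaks the signature
```

`hmac.compare_digest` matters here: a naive `==` comparison leaks timing information an attacker can exploit.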
```bash
# pip install (single node)
pip install multi-fleet

# Or clone for development
git clone https://github.com/supportersimulator/multi-fleet
cd multi-fleet && pip install -e .
```

```bash
export MULTIFLEET_NODE_ID=mac1
export NATS_URL=nats://127.0.0.1:4222
python3 -m multifleet.daemon serve
```

```bash
curl -X POST http://127.0.0.1:8855/message \
  -H "Content-Type: application/json" \
  -d '{
    "type": "context",
    "to": "mac2",
    "payload": {"subject":"deploy","body":"push the landing page"}
  }'
```

```bash
curl -s http://127.0.0.1:8855/health | jq
bash scripts/fleet-check.sh      # full dashboard
bash scripts/fleet-summary.sh    # one-line status bar
```

That's it. Two nodes with `MULTIFLEET_NODE_ID` set to different values, both connected to the same NATS (or both reachable via any of P2–P7), and you have a fleet.
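The same `/message` call can be wrapped in a tiny Python helper using only the standard library. The payload shape mirrors the curl example above; `send_message` assumes a daemon is actually listening on `:8855`:

```python
import json
import urllib.request

def build_packet(to: str, subject: str, body: str) -> dict:
    """Mirror of the curl payload shown above."""
    return {"type": "context", "to": to, "payload": {"subject": subject, "body": body}}

def send_message(packet: dict, base: str = "http://127.0.0.1:8855") -> bytes:
    req = urllib.request.Request(
        f"{base}/message",
        data=json.dumps(packet).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires a running daemon
        return resp.read()

pkt = build_packet("mac2", "deploy", "push the landing page")
# send_message(pkt)  # uncomment with a daemon listening on :8855
```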
┌─────────────────────────────────────────────────────────────┐
│ Your Machine (mac1) │
│ ┌────────────┐ ┌──────────────┐ ┌─────────────────┐ │
│ │ Claude Code├───▶│ HTTP :8855 │───▶│ ChannelProtocol │ │
│ │ /Cursor/etc│ │ /message │ │ (9 channels) │ │
│ └────────────┘ └──────────────┘ └────────┬────────┘ │
│ │ │
│ ┌──────────────────────────────────────────────▼────────┐ │
│ │ NATS │ HTTP │ SSH │ Git │ WoL │ IPC │ ...│ │
│ └────────┬─────────────────────────────────────────────┘ │
└───────────┼──────────────────────────────────────────────────┘
│
▼ (whichever channel is healthy)
┌─────────────────────────────────────────────────────────────┐
│ Peer Machine (mac2) │
│ │
│ Inbound packet ▶ inbox ▶ hook ▶ Claude Code session │
└─────────────────────────────────────────────────────────────┘
See ARCHITECTURE.md for the full deep dive: JetStream replication, MFINV invariants C01–C07, fleet-state KV, plist drift detection, and the self-heal loop.
These cannot be relaxed:
- ZSF — Zero Silent Failures. Every exception path bumps an observable counter. `except Exception: pass` is forbidden.
- No Single Point of Failure. No channel, daemon, or peer is required. The cascade always has a fallback.
- No Polling. Event-driven everywhere. Polling loops fail the `test_no_polling_invariant.py` gate.
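The ZSF pattern is easy to adopt in your own hooks and scripts. A minimal sketch (the decorator name is hypothetical, not part of Multi-Fleet's API): instead of swallowing an exception, bump a named counter so the failure stays observable:

```python
from collections import Counter
from functools import wraps

counters = Counter()

def counted_failures(counter_name: str):
    """ZSF: never swallow an exception without bumping a named counter."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                counters[counter_name] += 1   # observable, greppable, alertable
                return None
        return wrapper
    return decorate

@counted_failures("http_send_errors_total")
def flaky_send(ok: bool):
    if not ok:
        raise ConnectionError("peer unreachable")
    return "sent"

flaky_send(True)
flaky_send(False)
assert counters["http_send_errors_total"] == 1
```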
Multi-Fleet ships with a JetStream-backed KV store (fleet_roster) that every node reads and writes. It tracks who's alive, who's chief, who has which capability, and which channels are working between which peers.
```bash
nats kv get fleet_roster mac2 --raw | jq
```

```json
{
  "node_id": "mac2",
  "last_seen": "2026-05-13T21:14:02Z",
  "capabilities": ["nats","http","ssh","git"],
  "peers_seen": ["mac1","mac3"],
  "git_head": "865f4a8...",
  "claude_session": "fab27887..."
}
```

This is what makes the cascade smart: each node knows which channels work to which peer right now, and picks the cheapest healthy one first.
- Distributed AI development — Run Claude Code on 3 machines, route subtasks to whichever has the right context/GPU/database
- Always-available context — Your IDE on the laptop tells the desktop "remember this for tomorrow"; the desktop persists it even if the laptop closes
- Auto-recovering pipelines — CI/CD steps that don't fail when one box loses Wi-Fi for 90 seconds
- Pair-programming with your own fleet — A second AI on a second machine reviews PRs while you keep coding
- Race coordination — Multiple AI agents compete on the same task; first to finish wins, others learn from the winner
- Swarm refactor — Carve up a monorepo across N machines, each agent claims a directory, wake up to a stack of tested PRs
- Continuous background review — Every commit you push triggers a fleet-wide validation pass on every other node
- Geographically distributed teams — One cluster across home office, cabin laptop, cloud node, and a collaborator's machine on another continent
- Solo-founder force multiplier — One person, N machines. Don't hire — provision.
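Race coordination, for instance, maps naturally onto first-completed-wins semantics. A local sketch with the standard library (real Multi-Fleet races span machines; this only shows the shape, with sleep delays standing in for work):

```python
import concurrent.futures
import time

def agent(name: str, delay: float) -> str:
    time.sleep(delay)   # stand-in for "fix the bug on machine N"
    return name

with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(agent, n, d)
               for n, d in [("mac1", 0.3), ("mac2", 0.05), ("mac3", 0.2)]]
    done, _ = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED)
    winner = next(iter(done)).result()   # first agent to finish wins
```

Losing agents still run to completion here; in a fleet, "others learn from the winner" means their sessions receive the winning diff as context instead of being discarded.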
| Tool | Plugin | Status |
|---|---|---|
| Claude Code | plugin (this repo) | ✅ Stable |
| Cursor | MCP server (`tools/multifleet_mcp.py`) | ✅ Stable |
| VS Code | extension host bridge | ✅ Stable |
| Codex | `codex-config.toml.example` | ✅ Stable |
| Gemini | `gemini-extension.json` | ✅ Stable |
| Superset terminal-host | P8 IPC | 🧪 Experimental |
Every failure path in Multi-Fleet bumps a named counter. You can grep the codebase: there is no `except: pass`. The CI gate enforces this.
```bash
curl -s http://127.0.0.1:8855/health | jq '.counters'
```

```json
{
  "nats_publish_errors_total": 0,
  "http_send_errors_total": 2,
  "ssh_seed_write_errors_total": 0,
  "channel_cascade_fallback_total": 14,
  "self_heal_repair_attempts_total": 3,
  "self_heal_repair_success_total": 3,
  "webhook_offsite_nats_publish_total": 1024
}
```

If you see an error counter climbing, you have a real bug. If the error counters hold steady, your fleet is healthy — fallback and repair counters are expected to tick as the cascade does its job.
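A watcher that flags climbing error counters takes only a few lines: compare two `/health` snapshots and report the deltas. The counter names follow the sample above; the `_errors_total` suffix convention is assumed:

```python
def climbing_counters(before: dict, after: dict) -> dict:
    """Return only the error counters that increased between two /health snapshots."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if name.endswith("_errors_total") and after[name] > before.get(name, 0)
    }

before = {"http_send_errors_total": 2, "nats_publish_errors_total": 0}
after  = {"http_send_errors_total": 5, "nats_publish_errors_total": 0}
assert climbing_counters(before, after) == {"http_send_errors_total": 3}
```

Run it on a cron or a NATS subscription and alert on any non-empty result.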
- 9-priority channel cascade
- JetStream replication (R=3 across cluster)
- Fleet-state KV with auto-sync
- Self-heal loop with cardinal-direction repair
- HMAC signing
- Leaf-node mode for off-network reliability
- Discord P0 channel for emergency relay
- WireGuard auto-mesh for zero-config private networking
- Native Windows daemon (currently macOS/Linux first-class)
- Browser extension for in-tab fleet visibility
- ARCHITECTURE.md — How it works, the deep dive
- INSTALL.md — macOS / Linux / Windows install
- DEMO.md — Try it in 5 minutes
- COMMS-PROTOCOL.md — Packet format, signing, replay protection
- SCALING.md — Beyond 3 nodes
- CONTRIBUTING.md — How to help
- CODE_OF_CONDUCT.md — Be kind
Multi-Fleet powers a working 4-node fleet (mac1, mac2, mac3, cloud) running 24/7 across home network, cellular, and AWS. It has shipped through real network partitions, sleep/wake cycles, NATS server restarts, and intermittent Wi-Fi without dropping a message.
It is production-tested at small scale. We invite you to test it at yours.
MIT — do anything you want, just keep the copyright notice. See LICENSE.
Built by humans who got tired of `ssh user@mac2 'pkill -9 daemon'`.
⭐ Star this repo if you've ever had three machines and wished they talked to each other.