OpenRouter for agents and containers.
Agent Machines is the product layer above sandboxes: a control plane that deploys a persistent agent worker as one unit (runtime, skills, MCP, integrations, cron, observation, and fleet management) on any substrate you route to.
People don't want a bare sandbox. They want a worker that audits code, runs on a schedule, and gets smarter over time. This repo is the control plane, not the agent runtime package.
Live site: https://www.agent-machines.dev Source: https://github.com/Kevin-Liu-01/agent-machines
- What it is
- The core idea: dual routing
- Browser Agent Console
- Deploy → Bootstrap → Attach → Talk
- The streaming gateway
- The harness
- Architecture
- Provider capability matrix
- Quick start
- CLI
- Web app
- Repository layout
- Data boundaries and security
- Further reading
Use one account to choose the agent runtime and the substrate, then receive a full persistent worker instead of an empty box.
| Analogy | Meaning |
|---|---|
| OpenRouter for agents and containers | One account routes which runtime and which substrate, the way OpenRouter routes models |
| Vercel on AWS | We are the product layer; substrates are interchangeable infrastructure underneath |
Building sandboxes is hard. We route across existing ones instead of rebuilding infra, and keep the value in the harness, the control plane, and observation.
Two audiences:
- Humans operate the dashboard: route runtime + substrate, provision presets, supervise the fleet.
- Other agents (endgame) drive an MCP + CLI surface so a head agent can provision, route, observe, and tear down workers.
Most products lock you into one runtime or one cloud. Agent Machines routes both axes independently. Every substrate implements a single MachineProvider interface (provision / state / wake / sleep / destroy / exec / streamExec), so the rest of the system is provider-agnostic.
| Axis | Options | Abstraction |
|---|---|---|
| Agent runtime | Hermes, OpenClaw, Claude Code, Codex CLI | bootstrap phase recipes + launch commands |
| Substrate | E2B, Sprites.dev, Vercel Sandbox, Dedalus Machines | MachineProvider (web/lib/providers/*) |
| Model upstream | Dedalus router, OpenRouter, native OpenAI / Anthropic keys, OpenAI-compatible gateway | router presets + per-machine credential gate |
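The README names the seven `MachineProvider` operations but not their signatures. The sketch below is one possible shape: every type, parameter, and return value is an assumption for illustration, and the real interface in `web/lib/providers/*` may differ.

```typescript
// Hypothetical shapes: the README names the seven operations, not their signatures.
interface ExecResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

type MachineState = "running" | "sleeping" | "destroyed";

interface MachineProvider {
  provision(spec: { name: string }): Promise<{ machineId: string }>;
  state(machineId: string): Promise<MachineState>;
  wake(machineId: string): Promise<void>;
  sleep(machineId: string): Promise<void>;
  destroy(machineId: string): Promise<void>;
  exec(machineId: string, command: string): Promise<ExecResult>;
  streamExec(machineId: string, command: string): AsyncIterable<string>;
}

// The operation names come from the README; the list lets callers verify an adapter.
const MACHINE_PROVIDER_OPS = [
  "provision", "state", "wake", "sleep", "destroy", "exec", "streamExec",
] as const;

// Duck-type check: true when every operation is present as a function.
function isMachineProvider(candidate: unknown): candidate is MachineProvider {
  if (typeof candidate !== "object" || candidate === null) return false;
  const record = candidate as Record<string, unknown>;
  return MACHINE_PROVIDER_OPS.every((op) => typeof record[op] === "function");
}
```

Because callers depend only on this surface, adding a substrate would mean writing one adapter rather than touching the dashboard or the console.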
A credential gate blocks provisioning when the chosen runtime has no usable model upstream or the substrate has no key, so spin-up never fails silently downstream.
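A minimal sketch of such a gate, assuming keys are held in simple per-substrate and per-upstream maps. The function name, the input shape, and especially the runtime-to-upstream compatibility table are illustrative assumptions, not the repo's actual rules.

```typescript
type Runtime = "hermes" | "openclaw" | "claude-code" | "codex-cli";
type Substrate = "e2b" | "sprites" | "vercel-sandbox" | "dedalus";
type Upstream = "dedalus-router" | "openrouter" | "openai" | "anthropic" | "openai-compatible";

// Illustrative compatibility table, NOT taken from the repo.
const USABLE_UPSTREAMS: Record<Runtime, readonly Upstream[]> = {
  hermes: ["dedalus-router", "openrouter", "openai", "anthropic", "openai-compatible"],
  openclaw: ["dedalus-router", "openrouter", "openai", "anthropic", "openai-compatible"],
  "claude-code": ["anthropic"],
  "codex-cli": ["openai", "openai-compatible"],
};

interface GateInput {
  runtime: Runtime;
  substrate: Substrate;
  substrateKeys: Partial<Record<Substrate, string>>;
  upstreamKeys: Partial<Record<Upstream, string>>;
}

// A key counts only when it is a non-blank string.
function hasKey(value: string | undefined): boolean {
  return typeof value === "string" && value.trim() !== "";
}

// Returns the reasons provisioning must be blocked; an empty list means "go".
function checkCredentialGate(input: GateInput): string[] {
  const blockers: string[] = [];
  if (!hasKey(input.substrateKeys[input.substrate])) {
    blockers.push(`no API key for substrate "${input.substrate}"`);
  }
  const usable = USABLE_UPSTREAMS[input.runtime].some((upstream) =>
    hasKey(input.upstreamKeys[upstream]),
  );
  if (!usable) {
    blockers.push(`no usable model upstream for runtime "${input.runtime}"`);
  }
  return blockers;
}
```

Returning a list of reasons instead of a boolean lets the setup page show every missing credential at once, before any machine is created.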
The headline capability and the hardest engineering problem in the repo: you operate the real agent CLI (Codex, Claude Code, Hermes, OpenClaw) from a browser tab, on a remote worker, with no local terminal and no tunnel.
One browser console, two runtimes: Hermes Agent (left) and OpenAI Codex CLI (right). Each is the real CLI attached over tmux-over-exec, not a chat reskin.
The most capable agent tools ship as terminal programs (CLIs), not chat widgets. A live browser terminal normally requires a long-lived WebSocket PTY server in the middle, relaying every keystroke. A serverless control plane (Next.js on Vercel) cannot be that: functions time out (~110s), cold-start, and have no sticky sessions. So teams usually pick a worse option:
- assume a local terminal (`claude` / `codex` on your laptop), excluding non-terminal users, or
- wrap the agent in a chat UI and lose the full TUI/CLI experience, or
- ship their own terminal locked to their own sandbox infra, or
- bolt on a relay/tunnel they then have to operate.
Stop hosting the session in the API. Put the session on the worker; keep the control plane stateless.
- Session lives on the box. A persistent `tmux` session (`amconsole`) holds the PTY, the agent process, and scrollback. It survives serverless cold starts and function timeouts.
- Control plane is stateless HTTP + SSE. No WebSocket server on Vercel; no tunnel on the console path.
- Input is `tmux send-keys -H <hex>` (one quick exec per coalesced keystroke batch).
- Output is an unbuffered `tail -f` on the pane log, streamed back over SSE from a byte offset.
- Resize is a single `tmux resize-window` exec.
exec is the only primitive each substrate must provide, so the same UI works across Dedalus, E2B, Sprites, and Vercel.
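To make the input path concrete, here is a sketch of turning one keystroke batch into a `tmux send-keys -H` command line. The helper names and the session-name check are assumptions. The tmux manual describes `-H` in terms of ASCII characters, so how multi-byte UTF-8 input behaves when sent byte by byte is not verified here.

```typescript
// Encode a keystroke batch as two-digit hex bytes, one token per byte.
function toHexKeys(input: string): string[] {
  return Array.from(new TextEncoder().encode(input), (byte) =>
    byte.toString(16).padStart(2, "0"),
  );
}

// Build the exec command for one batch. Hypothetical helper, not the repo's code.
function buildSendKeysCommand(session: string, input: string): string {
  // Restrict the session name so it is safe to interpolate into a shell command.
  if (!/^[A-Za-z0-9_-]+$/.test(session)) {
    throw new Error(`unsafe tmux session name: ${session}`);
  }
  const keys = toHexKeys(input);
  if (keys.length === 0) {
    throw new Error("refusing to send an empty keystroke batch");
  }
  return `tmux send-keys -t ${session} -H ${keys.join(" ")}`;
}

console.log(buildSendKeysCommand("amconsole", "ls\r"));
// prints: tmux send-keys -t amconsole -H 6c 73 0d
```

Hex tokens contain only `0-9a-f`, so quotes, backslashes, and escape sequences typed by the user never need shell quoting on their way through `exec`.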
Browser (xterm.js)
| keystrokes --POST--> /api/dashboard/terminal/input (tmux send-keys -H)
| output <--SSE--- /api/dashboard/terminal/stream (unbuffered tail -f)
v
Next.js control plane (Clerk auth, resolve machine + provider creds) -- stateless, no PTY here
v
provider.exec / provider.streamExec
v
Remote VM: tmux "amconsole" + pipe-pane --> /tmp/am-console.log
v
Agent CLI running inside the pane (codex | claude | hermes | openclaw)

The closest predecessor is AWS CloudShell, a real browser terminal, but it is locked to the AWS ecosystem. This is CloudShell-shaped, substrate-agnostic, wired to agent CLIs, on a stack that structurally cannot host a PTY server. The tmux-over-exec + SSE pattern is independent of agents and could stand alone as a serverless browser-terminal primitive.
The interactive console is tuned to feel close to local:
- parallel xterm bundle load and `tmux` session attach,
- snapshot paint on connect (`capture-pane`), so the screen is never blank,
- `requestAnimationFrame`-batched writes to xterm,
- immediate flush for control keys (Enter, arrows, Ctrl, Tab) and an 8ms coalesce window for printable text,
- input POSTs serialized through one promise chain so keystrokes never reorder,
- per-request `getUserConfig` cache (2.5s) and machine-state cache (3s),
- E2B sandbox connect reuse (45s) within a warm serverless instance,
- `tmux` pre-installed during bootstrap so the first attach never triggers a package install.
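The coalescing and ordering rules in the list above can be sketched as a small class. This is an illustration under stated assumptions, not the repo's implementation: the class name, the options, and the rule "any C0 control byte or DEL means flush now" are all hypothetical. The scheduler is injectable so the 8ms window can be driven by hand in tests.

```typescript
type ScheduleFn = (callback: () => void, delayMs: number) => void;

interface CoalescerOptions {
  send: (batch: string) => Promise<void>; // e.g. a POST to the input route
  schedule?: ScheduleFn;
  windowMs?: number;
  onError?: (error: unknown) => void;
}

// Enter, Tab, Ctrl combos, and ESC (the prefix of arrow keys) are all below 0x20.
function hasControlByte(data: string): boolean {
  for (let i = 0; i < data.length; i += 1) {
    const code = data.charCodeAt(i);
    if (code < 0x20 || code === 0x7f) return true;
  }
  return false;
}

class InputCoalescer {
  private pending = "";
  private timerArmed = false;
  private chain: Promise<void> = Promise.resolve();
  private readonly send: (batch: string) => Promise<void>;
  private readonly schedule: ScheduleFn;
  private readonly windowMs: number;
  private readonly onError: (error: unknown) => void;

  constructor(options: CoalescerOptions) {
    this.send = options.send;
    this.schedule =
      options.schedule ?? ((callback, delayMs) => { setTimeout(callback, delayMs); });
    this.windowMs = options.windowMs ?? 8;
    this.onError = options.onError ?? (() => undefined);
  }

  push(data: string): void {
    this.pending += data;
    if (hasControlByte(data)) {
      this.flush(); // control keys go out immediately, with any buffered text
      return;
    }
    if (!this.timerArmed) {
      this.timerArmed = true;
      this.schedule(() => {
        this.timerArmed = false;
        this.flush();
      }, this.windowMs);
    }
  }

  flush(): void {
    if (this.pending === "") return;
    const batch = this.pending;
    this.pending = "";
    // One chain: each send starts only after the previous one has settled.
    this.chain = this.chain
      .then(() => this.send(batch))
      .catch((error: unknown) => { this.onError(error); });
  }

  // Resolves once every batch queued so far has been sent (or has failed).
  idle(): Promise<void> {
    return this.chain;
  }
}
```

Chaining every send onto the previous promise is what keeps order stable: two POSTs are never in flight at once, so a slow request cannot be overtaken by a later one.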
Full write-up: knowledge/BROWSER-AGENT-CONSOLE.md. Engineering spec: web/docs/sandbox-terminal-gateway.md.
The one-click flow (DeployAndTalk) chains four stages:
- Provision. `POST /api/dashboard/admin/provision-machine` creates the machine through the selected `MachineProvider` and records a `MachineRef` (provider, runtime, spec, model, router) in the user's config. The credential gate runs first.
- Bootstrap. `POST /api/dashboard/admin/bootstrap` runs phase-aligned shell recipes (`web/lib/bootstrap/runner.ts`): system deps, runtime install, agent configuration, gateway launch, then best-effort post-gateway phases. Each phase wraps its command, tees to `bootstrap.log` on the VM, and persists `bootstrapState` after every step so progress survives failures.
- Attach. The browser opens the terminal page with `?launch=1`, attaches the `tmux` console, and paints the pane snapshot.
- Talk. The agent CLI auto-launches inside the pane and you interact line by line, including full-screen TUIs.
Bootstrap phases are split into CORE_BOOTSTRAP_PHASES (must succeed; gateway marked ready) and POST_GATEWAY_BOOTSTRAP_PHASES (best-effort, e.g. browser tooling) so a slow optional install can't block the agent from coming online.
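The core versus best-effort split can be sketched as follows. Only the two constant names and the rough phase order come from this README; the phase commands, the state shape, and `runBootstrap` are hypothetical. The exec callback is synchronous to keep the sketch short, whereas a real runner would await a remote exec.

```typescript
interface Phase {
  name: string;
  command: string;
}

type PhaseStatus = "ok" | "failed";

interface BootstrapState {
  gatewayReady: boolean;
  phases: Record<string, PhaseStatus>;
}

// Placeholder commands; the real recipes live in the repo's bootstrap runner.
const CORE_BOOTSTRAP_PHASES: Phase[] = [
  { name: "system-deps", command: "echo install system deps" },
  { name: "runtime-install", command: "echo install runtime" },
  { name: "agent-config", command: "echo write agent config" },
  { name: "gateway-launch", command: "echo launch gateway" },
];

const POST_GATEWAY_BOOTSTRAP_PHASES: Phase[] = [
  { name: "browser-tooling", command: "echo install browser tooling" },
];

function runBootstrap(
  exec: (command: string) => number, // returns the exit code
  persist: (state: BootstrapState) => void,
): BootstrapState {
  const state: BootstrapState = { gatewayReady: false, phases: {} };
  const snapshot = (): BootstrapState => ({
    gatewayReady: state.gatewayReady,
    phases: { ...state.phases },
  });

  for (const phase of CORE_BOOTSTRAP_PHASES) {
    const ok = exec(phase.command) === 0;
    state.phases[phase.name] = ok ? "ok" : "failed";
    persist(snapshot()); // persisted after every step, so progress survives a crash
    if (!ok) return state; // a core failure stops the run; the gateway stays not ready
  }

  state.gatewayReady = true;
  persist(snapshot());

  for (const phase of POST_GATEWAY_BOOTSTRAP_PHASES) {
    // Best-effort: record the outcome but never block the agent.
    state.phases[phase.name] = exec(phase.command) === 0 ? "ok" : "failed";
    persist(snapshot());
  }
  return state;
}
```

In this sketch a failed browser-tooling install leaves `gatewayReady` true, while a failed runtime install stops the run before the gateway phase is attempted.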
Two output paths reuse the same SSE event contract (started · output · idle · error) so the UI is identical regardless of substrate capability.
- Interactive console (`/api/dashboard/terminal/*`): live PTY over tmux-over-exec, described above.
- One-shot exec (`/api/dashboard/exec/stream`) and bootstrap tail (`/api/dashboard/bootstrap/stream`): run a command, stream stdout/stderr while it runs.
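One way to picture the shared event contract is as a discriminated union serialized into SSE frames. The four event names come from this README; the payload fields (`offset`, `data`, `message`) and both helper functions are assumptions for illustration.

```typescript
type ConsoleEvent =
  | { type: "started"; offset: number }
  | { type: "output"; offset: number; data: string }
  | { type: "idle"; offset: number }
  | { type: "error"; message: string };

// Serialize one event as an SSE frame: an "event:" line, a "data:" line, a blank line.
function formatSseEvent(event: ConsoleEvent): string {
  const { type, ...payload } = event;
  // JSON.stringify escapes newlines, so the payload always fits on one data line.
  return `event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Build an output event whose offset is the byte position just after this chunk,
// so a reconnecting client can resume the tail from exactly there.
function outputEvent(startOffset: number, data: string): ConsoleEvent {
  const byteLength = new TextEncoder().encode(data).length;
  return { type: "output", offset: startOffset + byteLength, data };
}
```

Carrying the offset in each event means a stream that ends on a function timeout can be reopened without replaying or dropping output.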
Streaming prefers each provider's native primitive and only falls back to log-tail polling for substrates that physically cannot stream. The callback-to-generator adapter lives in web/lib/providers/stream-util.ts (bridgeExecStream).
The agent's HTTP chat gateway (Hermes :8642 / OpenClaw :18789) is now optional. The console path is exec-first and needs no public URL or Cloudflare tunnel; POST /api/chat degrades gracefully to "use the Terminal console" when a machine has no public gateway URL. Machine bearer tokens never become NEXT_PUBLIC_*.
A worker is a runtime plus a composable harness. Counts are derived from the registry (web/lib/platform/harness.ts), not hard-coded.
| Layer | Source | Notes |
|---|---|---|
| Skills | `knowledge/skills/<name>/SKILL.md` | 161 skills; synced to `~/.agent-machines/skills/` on deploy/reload |
| MCP servers | `knowledge/mcps` catalog | 35 servers; credential-gated (Vercel, Stripe, Supabase, Clerk, Figma, PostHog, Sentry, Datadog, Linear, Slack, GitHub, and more) |
| Service routes | loadout registry | MCP → CLI → skill preference per vendor |
| CLIs | bootstrap install | agent-browser, Playwright, gh, curl, jq, sqlite3, and more |
| Agent-native tools | per runtime | vary by runtime; Hermes is richest (terminal, fs, browser, vision, cron, memory, delegate) |
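The MCP → CLI → skill preference in the service-routes row can be read as a simple fallback chain. The sketch below assumes a per-vendor record with optional entries and treats an MCP server as usable only when its credential is present; the type names, field names, and example targets are hypothetical.

```typescript
type RouteKind = "mcp" | "cli" | "skill";

interface VendorRoutes {
  mcp?: { server: string; credentialPresent: boolean };
  cli?: string;
  skill?: string;
}

interface ResolvedRoute {
  kind: RouteKind;
  target: string;
}

// Pick the first available route in MCP -> CLI -> skill order.
function resolveServiceRoute(routes: VendorRoutes): ResolvedRoute | null {
  if (routes.mcp !== undefined && routes.mcp.credentialPresent) {
    return { kind: "mcp", target: routes.mcp.server };
  }
  if (routes.cli !== undefined) {
    return { kind: "cli", target: routes.cli };
  }
  if (routes.skill !== undefined) {
    return { kind: "skill", target: routes.skill };
  }
  return null; // no route available for this vendor
}
```

For example, a GitHub entry with an MCP server but no stored credential would fall through to the `gh` CLI in this sketch.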
Skills follow the SKILL.md protocol: procedures saved to the machine compound over time and cannot be exported out of a stateless chat product.
you
| browser / CLI / API
v
Next.js control plane ---------------- CLI: deploy / chat / reload
| Clerk-backed UserConfig (keys, machines, routers)
v
MachineProvider ---------------- E2B | Sprites | Vercel Sandbox | Dedalus
| provision / state / wake / sleep / destroy / exec / streamExec
v
persistent worker (provider home: /home/user | /home/sprite | /vercel/sandbox | /home/machine)
|
|-- tmux "amconsole" interactive browser console (PTY over exec)
|-- :8642 / :18789 gateway optional HTTP chat (exec-first; no tunnel required)
|-- ~/.agent-machines/ skills, mcps, chats, crons, sessions, logs, artifacts
|-- <home>/agent-machines/ git checkout, used for knowledge reload
v
model upstream (Dedalus router | OpenRouter | native OpenAI/Anthropic | OpenAI-compatible)

Every substrate implements MachineProvider; streaming tier depends on the SDK.
| Substrate | streamExec primitive | Streaming tier |
|---|---|---|
| E2B | `commands.run(cmd, { onStdout, onStderr })`, bridged to a generator | native stream |
| Sprites | `spawn()` process stdout / stderr Readables, bridged | native stream |
| Vercel Sandbox | `Command.logs()` async iterator on a detached command | native stream |
| Dedalus | none (REST exec returns output only after completion) | poll fallback |
Native tiers relay output frame by frame with no extra exec calls. The poll fallback launches a detached command, tees combined output to a temp log, and polls new bytes until an exit-marker file appears.
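The polling half of that fallback can be sketched as an async generator. It assumes two hypothetical helpers, each backed by a one-shot exec on the machine: one returns log bytes from a given offset, the other returns the exit code once the marker file exists. Launching the detached command is out of scope here.

```typescript
interface PollIo {
  // Hypothetical: bytes of the temp log starting at byteOffset.
  readLogFrom(byteOffset: number): Promise<Uint8Array>;
  // Hypothetical: exit code from the marker file, or null while still running.
  readExitMarker(): Promise<number | null>;
}

async function* pollExecStream(
  io: PollIo,
  intervalMs = 500,
): AsyncGenerator<string, number, void> {
  const decoder = new TextDecoder();
  let offset = 0;
  for (;;) {
    // Check the marker BEFORE reading, so the read that follows a finished
    // command is guaranteed to see its final bytes.
    const exitCode = await io.readExitMarker();
    const chunk = await io.readLogFrom(offset);
    if (chunk.length > 0) {
      offset += chunk.length;
      // stream: true keeps multi-byte characters split across polls intact.
      yield decoder.decode(chunk, { stream: true });
    }
    if (exitCode !== null) return exitCode;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Each loop iteration costs two exec calls in this sketch, which is the overhead the native tiers avoid.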
git clone https://github.com/Kevin-Liu-01/agent-machines
cd agent-machines
cp .env.example .env
npm install
npm run deploy # CLI path (Hermes); or use /dashboard/setup for any provider

Requires Node >= 20.
npm run deploy # provision + bootstrap Hermes
npm run deploy:openclaw # provision + bootstrap OpenClaw
npm run chat -- "message" # chat with the active machine's gateway
npm run status # active machine state
npm run logs # tail gateway logs
npm run shell # exec a shell command on the machine
npm run wake / sleep / destroy -- --yes
npm run reload # git-pull the repo on the VM and re-sync knowledge
npm run doctor # environment + machine health checks
npm run benchmark # cross-substrate boot/exec/IO benchmarks

cd web
cp .env.local.example .env.local
npm install
npm run dev

Open http://localhost:3210.
Configure Clerk for authenticated routes:
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=...
CLERK_SECRET_KEY=...

Production note: use Clerk production keys (`pk_live_…` / `sk_live_…`) on the deployed domain. Development keys carry strict rate limits and store metadata on a separate instance.
| Route | Purpose |
|---|---|
| `/` | landing: fleet demo, activity grid, loadout, architecture |
| `/dashboard` | fleet overview, gateway health, telemetry |
| `/dashboard/setup` | route runtime + substrate, credentials, provision |
| `/dashboard/machines` | fleet supervision, per-machine focus |
| `/dashboard/machines/[id]/terminal` | Browser Agent Console (interactive + one-shot) |
| `/dashboard/machines/[id]/chat` | gateway chat for a machine |
| `/dashboard/loadout` | skills, MCP catalog, service and task routes |
| `/dashboard/skills` `/mcps` `/cron` `/logs` `/sessions` `/artifacts` | harness + observation surfaces |
agent-machines/
src/ CLI + bootstrap (tsx)
web/ Next.js control plane (site + dashboard + provider adapters)
app/api/dashboard/terminal/* interactive console routes
lib/providers/* MachineProvider implementations
lib/bootstrap/runner.ts browser bootstrap (phase recipes)
lib/dashboard/terminal-session.ts tmux-over-exec session logic
docs/sandbox-terminal-gateway.md engineering spec
knowledge/ skills, mcps, VISION.md, AGENTS.md, BROWSER-AGENT-CONSOLE.md
mcp/ cursor-bridge MCP server

- Clerk private metadata stores provider API keys, the Cursor key, gateway bearer tokens, and the full `UserConfig`.
- Clerk public metadata only exposes redacted setup state and machine summaries.
- All agent state (runtime, app data, skills, sessions, crons, config) lives under `<provider home>/.agent-machines/`.
- The VM repo checkout (`<provider home>/agent-machines/`) is used only for knowledge reloads.
- Machine gateway bearers are server-only and never shipped to the client as `NEXT_PUBLIC_*`.
- App-Router error boundaries (`app/dashboard/error.tsx`, `app/global-error.tsx`) keep a single component crash from white-screening the dashboard.
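As an illustration of the path rules above, the sketch below resolves the state directory and repo checkout per substrate. The home directories are the four listed in the architecture diagram; pairing them with substrates in the diagram's listed order is an inference, and the function and key names are hypothetical.

```typescript
type Substrate = "e2b" | "sprites" | "vercel-sandbox" | "dedalus";

// Assumed pairing, inferred from the order in the architecture diagram.
const PROVIDER_HOME: Record<Substrate, string> = {
  e2b: "/home/user",
  sprites: "/home/sprite",
  "vercel-sandbox": "/vercel/sandbox",
  dedalus: "/home/machine",
};

interface AgentPaths {
  stateDir: string; // all agent state lives here
  skillsDir: string; // synced skills
  repoCheckout: string; // used only for knowledge reloads
}

function agentPaths(substrate: Substrate): AgentPaths {
  const home = PROVIDER_HOME[substrate];
  return {
    stateDir: `${home}/.agent-machines`,
    skillsDir: `${home}/.agent-machines/skills`,
    repoCheckout: `${home}/agent-machines`,
  };
}
```

Keeping state and checkout in separate directories means a knowledge reload can replace the checkout without touching sessions, crons, or logs.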
| Doc | What it covers |
|---|---|
| `knowledge/VISION.md` | product vision and defensibility |
| `knowledge/BROWSER-AGENT-CONSOLE.md` | full Browser Agent Console architecture + positioning |
| `knowledge/BROWSER-AGENT-CONSOLE-EXPLAINER.md` | four-paragraph plain-language explainer |
| `knowledge/AGENT-MACHINES-EXPLAINER.md` | three-paragraph whole-product explainer |
| `web/docs/sandbox-terminal-gateway.md` | streaming gateway engineering spec |
| `web/README.md` | control-plane app details |
MIT.

