Run Hermes — the Nous Research personal AI assistant — inside a Cloudflare Sandbox container, fronted by a Cloudflare Worker.
Experimental: not officially endorsed by Nous Research or Cloudflare. This is a community project. Hermes upstream may break this template at any time; pin the `HERMES_VERSION` value in `container/Dockerfile` if you need stability.
A minimal, single-tenant Cloudflare Worker that:
- builds a Docker image with the Hermes Agent installed,
- runs that image inside a Cloudflare Sandbox container managed by a Durable Object,
- exposes Hermes' OpenAI-compatible API at `/v1/chat/completions`, and
- (optionally) reverse-proxies Hermes' native web dashboard on its own subdomain.
Provider API keys (Anthropic / OpenRouter / OpenAI) are brought by you (BYOK) — they live as Cloudflare secrets and are injected at process start. The Worker never persists them and never observes the messages flowing through /v1/chat/completions.
You get a personal Hermes that:
- sleeps when idle (Sandbox suspends the container after 4 hours of inactivity),
- wakes on demand when the next chat request arrives,
- survives sleep with Hermes session/cron state preserved under `~/.hermes/` inside the container,
- runs anywhere Cloudflare runs (no VPS, no Docker daemon on your machine).
- A Cloudflare account with Workers Paid plan (containers require Workers Paid).
- `wrangler` 3.95.0 or newer.
- Docker Desktop (or compatible) running locally; Cloudflare builds the container image from `container/Dockerfile` during `wrangler deploy`.
- At least one provider API key:
  - Anthropic (Claude models), or
  - OpenRouter (multi-provider routing), or
  - OpenAI (GPT models).
These numbers come from Cloudflare's Sandbox pricing and assume an idle-most-of-the-time personal usage pattern. Your actual bill will vary.
| Resource | Provisioned | Monthly active usage | Free tier | Overage estimate |
|---|---|---|---|---|
| Sandbox container | 1 × standard-1 | ~30 min / day active | None | ~$1.50 / month |
| Durable Object | 1 | < 1 M requests | 1 M / month | $0 |
| Worker requests | 1 Worker | < 100 k / month | 10 M / month | $0 |
| LLM inference (BYOK) | Whatever you pick | You decide | N/A | Paid to provider directly |
The Sandbox container scales to zero after sleepAfter (default 4 hours). A sleeping container costs nothing. Wake-up takes ~10–30 seconds the first time, then a few seconds for subsequent wakes.
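The Sandbox row in the table can be reproduced with simple arithmetic. The sketch below back-solves an hourly rate from the table's own figures (about 30 active minutes per day, about $1.50 per month). That implied rate is an assumption for illustration, not a published Cloudflare price, and `estimateMonthlyCost` is a hypothetical helper.

```typescript
// Hypothetical helper: estimate the monthly Sandbox bill from daily active time.
// The hourly rate is NOT a published Cloudflare price; it is back-solved from
// the table above (~30 min/day active ≈ $1.50/month over a 30-day month).
const DAYS_PER_MONTH = 30;

function activeHoursPerMonth(activeMinutesPerDay: number): number {
  return (activeMinutesPerDay / 60) * DAYS_PER_MONTH;
}

// Implied rate: $1.50 spread over 15 active hours per month.
const IMPLIED_USD_PER_ACTIVE_HOUR = 1.5 / activeHoursPerMonth(30);

function estimateMonthlyCost(activeMinutesPerDay: number): number {
  return activeHoursPerMonth(activeMinutesPerDay) * IMPLIED_USD_PER_ACTIVE_HOUR;
}

console.log(estimateMonthlyCost(30).toFixed(2)); // table baseline
console.log(estimateMonthlyCost(120).toFixed(2)); // two active hours per day
```

Under this linear model the estimate scales directly with active time, so check the current Sandbox pricing page before relying on any figure.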
           ┌──────────────────────────────────────────────────────┐
request    │  Cloudflare Worker ( src/index.ts )                  │
─────────► │   ├─ /api/health, /v1/chat/completions, /api/...     │
           │   └─ optional dashboard hostname proxy               │
           └─────────┬────────────────────────────┬───────────────┘
                     │ Sandbox SDK                │
                     │ containerFetch             │ startProcess
                     ▼                            ▼
           ┌──────────────────────────────────────────────────────┐
           │  Durable Object: HermesInstance                      │
           │   └─ Cloudflare Sandbox container                    │
           │       ├─ port 18789 → Hermes API server              │
           │       └─ port 9119  → Hermes native dashboard (web)  │
           └──────────────────────────────────────────────────────┘
The Worker is stateless. All Hermes state (sessions, crons, cached skills) lives inside ~/.hermes/ in the container and is preserved across sleeps by Cloudflare Sandbox's snapshot behaviour.
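The hostname split in the diagram can be sketched as a small pure function. This is an illustration of the idea rather than the code in `src/index.ts`: `pickContainerPort` is a hypothetical name, and only the two port numbers come from the diagram.

```typescript
// Ports taken from the architecture diagram above.
const HERMES_API_PORT = 18789;
const HERMES_DASHBOARD_PORT = 9119;

// Hypothetical helper: decide which container port a request should reach.
// `dashboardHostname` stands in for the optional DASHBOARD_HOSTNAME setting.
function pickContainerPort(requestUrl: string, dashboardHostname?: string): number {
  const { hostname } = new URL(requestUrl);
  const isDashboard =
    dashboardHostname !== undefined &&
    dashboardHostname !== "" &&
    hostname.toLowerCase() === dashboardHostname.toLowerCase();
  return isDashboard ? HERMES_DASHBOARD_PORT : HERMES_API_PORT;
}

console.log(pickContainerPort("https://hermes.example.com/sessions", "hermes.example.com"));
console.log(pickContainerPort("https://my-worker.example.workers.dev/v1/chat/completions"));
```

Keeping the decision in a function with no Worker dependencies makes it easy to unit test outside the Cloudflare runtime.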
# 1. Clone and install
git clone https://github.com/PlaydaDev/hermesworkers.git
cd hermesworkers
npm install
# 2. Log into Cloudflare
npx wrangler login
# 3. Edit wrangler.toml — replace <YOUR_WORKER_NAME> and <YOUR_ACCOUNT_ID>.
# Run `npx wrangler whoami` to grab your account id.
# 4. Push at least one provider API key as a secret
npx wrangler secret put ANTHROPIC_API_KEY # or OPENROUTER_API_KEY / OPENAI_API_KEY
# 5. (Optional) Push a Worker-side bearer token to gate the API.
# Generate one with: openssl rand -hex 32
npx wrangler secret put API_TOKEN
# 6. Deploy (Docker Desktop must be running)
npx wrangler deploy

After the first deploy, the Worker prints its `*.workers.dev` URL. Smoke test:
WORKER_URL="https://<your-worker>.<your-account>.workers.dev"
TOKEN="<the API_TOKEN you set, or empty if you skipped it>"
curl -s "$WORKER_URL/api/health" \
-H "Authorization: Bearer $TOKEN" | jq
curl -s "$WORKER_URL/v1/chat/completions" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-sonnet-4-5",
"messages": [{"role": "user", "content": "Hello, Hermes."}],
"stream": false
}'

The first request triggers a cold start; expect 15–60 seconds. Subsequent requests respond in normal API time.
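For callers that prefer TypeScript over curl, the same smoke-test call can be assembled as below. This is a minimal sketch: `buildChatRequest`, `ChatRequest`, and the 90-second timeout are assumptions made for illustration, while the path, headers, and JSON body mirror the curl command above.

```typescript
// Shape of the request the curl smoke test sends.
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: build the same call as the curl smoke test.
// An empty token omits the Authorization header (API_TOKEN not set).
function buildChatRequest(workerUrl: string, token: string, prompt: string): ChatRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token !== "") {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return {
    url: new URL("/v1/chat/completions", workerUrl).toString(),
    method: "POST",
    headers,
    body: JSON.stringify({
      model: "anthropic/claude-sonnet-4-5",
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  };
}

// Assumption: 60 s worst-case cold start plus some margin.
const COLD_START_TIMEOUT_MS = 90_000;

const req = buildChatRequest("https://my-worker.example.workers.dev", "s3cret", "Hello, Hermes.");
console.log(req.url, COLD_START_TIMEOUT_MS);
```

The result can be passed to `fetch(req.url, { ...req, signal: AbortSignal.timeout(COLD_START_TIMEOUT_MS) })` so that a cold start does not trip a short default client timeout.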
| Method | Path | Description |
|---|---|---|
| GET | `/` | Self-describing JSON (no auth) |
| GET | `/api/health` | Liveness probe + Hermes gateway status |
| POST | `/v1/chat/completions` | OpenAI-compatible chat (streaming supported) |
| POST | `/api/instance/wake` | Boot the container without sending a chat message |
| POST | `/api/instance/restart` | Hard restart (kills PID 1, Cloudflare respawns the image) |
| POST | `/api/instance/restart-gateway` | Graceful Hermes process restart (re-reads BYOK secrets) |
| POST | `/api/instance/stop` | Stop the Hermes processes (container stays alive) |
| GET | `/api/instance/logs` | Dump process list, Hermes config, server log tail |
All /v1/* and /api/* paths are gated by API_TOKEN if you set it.
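The gating rule can be expressed as a small check. The sketch below is one way to implement it and is not taken from the project source: `isGatedPath`, `safeEqual`, and `isAuthorized` are hypothetical names.

```typescript
// "/" stays open; /v1/* and /api/* are the gated prefixes.
function isGatedPath(pathname: string): boolean {
  return pathname.startsWith("/v1/") || pathname.startsWith("/api/");
}

// Compares every character so the check does not stop at the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Hypothetical gate: `authorization` is the raw Authorization header value,
// `apiToken` is the API_TOKEN secret (undefined when it was never set).
function isAuthorized(pathname: string, authorization: string | null, apiToken?: string): boolean {
  if (!isGatedPath(pathname)) return true; // e.g. GET /
  if (!apiToken) return true; // no API_TOKEN set: the API is open
  const prefix = "Bearer ";
  if (authorization === null || !authorization.startsWith(prefix)) return false;
  return safeEqual(authorization.slice(prefix.length), apiToken);
}

console.log(isAuthorized("/v1/chat/completions", "Bearer t0ken", "t0ken"));
```

Note the second early return: when `API_TOKEN` is unset, every gated path is reachable by anyone who knows the URL.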
Hermes ships a built-in web dashboard (sessions, analytics, models, crons, skills). To make it reachable, wire a hostname under your control to the Worker:
- Pick a hostname, e.g. `hermes.example.com`, and set it in `wrangler.toml`:

      [vars]
      DASHBOARD_HOSTNAME = "hermes.example.com"

- Add a proxied DNS record (CNAME) pointing the hostname at your Worker's route target.
- Add a Worker Route in `wrangler.toml`:

      routes = [ { pattern = "hermes.example.com/*", custom_domain = true } ]

- Redeploy: `npx wrangler deploy`
Visiting https://hermes.example.com now proxies straight to the Hermes native UI inside the container. WebSocket upgrades work transparently. If API_TOKEN is set, the dashboard hostname requires the same bearer token (or a hw_token=<value> cookie) before it responds.
See docs/custom-domain.md for a more detailed walk-through.
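The dashboard's two accepted credentials (bearer header or `hw_token` cookie) can be checked as in the sketch below. `readCookie` and `dashboardAuthorized` are hypothetical helpers written for illustration, and the plain string comparison is a simplification that a production gate might replace with a constant-time one.

```typescript
// Hypothetical helper: extract one cookie value from a Cookie request header.
function readCookie(cookieHeader: string | null, name: string): string | undefined {
  if (!cookieHeader) return undefined;
  for (const part of cookieHeader.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === name) {
      return part.slice(eq + 1).trim();
    }
  }
  return undefined;
}

// Accept either "Authorization: Bearer <token>" or a "hw_token=<token>" cookie.
function dashboardAuthorized(
  authorization: string | null,
  cookieHeader: string | null,
  apiToken?: string,
): boolean {
  if (!apiToken) return true; // API_TOKEN unset: the dashboard is not gated
  if (authorization === `Bearer ${apiToken}`) return true;
  return readCookie(cookieHeader, "hw_token") === apiToken;
}

console.log(dashboardAuthorized(null, "theme=dark; hw_token=abc", "abc"));
```

The cookie path matters for browsers, which cannot attach an `Authorization` header to ordinary page navigations.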
Hermes routes requests to providers based on the model id (e.g. anthropic/claude-sonnet-4-5 → Anthropic). For each provider you want to use, push the matching secret:
npx wrangler secret put ANTHROPIC_API_KEY
npx wrangler secret put OPENROUTER_API_KEY
npx wrangler secret put OPENAI_API_KEY

The Worker writes the secrets to `~/.hermes/.env` inside the container on every boot, so:
- rotating a key only needs a `wrangler secret put` followed by `POST /api/instance/restart-gateway`, and
- secrets never appear in the container's image layers.
See docs/byok-setup.md for provider-specific notes (model ids, gateways, rate limits).
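To make the boot-time secret injection concrete, the sketch below renders `.env` contents from whichever provider keys are set. It assumes plain `KEY=value` lines, which is the usual dotenv convention; the project's actual file format and the helper name `renderEnvFile` are assumptions.

```typescript
// Provider secrets named in this README.
const PROVIDER_KEYS = ["ANTHROPIC_API_KEY", "OPENROUTER_API_KEY", "OPENAI_API_KEY"] as const;
type ProviderKey = (typeof PROVIDER_KEYS)[number];

// Hypothetical helper: render the contents of ~/.hermes/.env from whichever
// provider secrets are set. Throws when none is present, since at least one
// provider key is required.
function renderEnvFile(secrets: Partial<Record<ProviderKey, string>>): string {
  const lines: string[] = [];
  for (const key of PROVIDER_KEYS) {
    const value = secrets[key];
    if (value) lines.push(`${key}=${value}`);
  }
  if (lines.length === 0) {
    throw new Error("At least one provider API key is required");
  }
  return lines.join("\n") + "\n";
}

console.log(renderEnvFile({ ANTHROPIC_API_KEY: "sk-ant-example" }));
```

Because the file is regenerated from secrets at boot, nothing sensitive needs to be baked into the image.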
The Cloudflare Sandbox keeps the container alive while it has open work, then suspends it after sleepAfter (default 4 hours) of inactivity. On suspend, in-memory state is checkpointed; on the next request, the container resumes within seconds.
This Worker does not boot the container at deploy time. The first chat (or POST /api/instance/wake) triggers startProcess('/usr/local/bin/start-hermes.sh', ...), which:
- Configures the Hermes API server (port 18789).
- Writes `~/.hermes/.env` from the provider secrets supplied by the Worker.
- Pins the default model from `HERMES_DEFAULT_MODEL` (or `anthropic/claude-sonnet-4-5`).
- Launches the native dashboard on port 9119 in the background.
- Execs `hermes gateway` in the foreground.
POST /api/instance/restart kills PID 1; Cloudflare respawns the container from the latest image (useful after wrangler deploy). POST /api/instance/restart-gateway only kills the Hermes processes; the container stays alive and re-reads ~/.hermes/.env on the next boot.
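Which restart endpoint to call depends on what changed. The lookup below summarizes the pairings this README describes; the `Change` labels and the `followUp` helper are hypothetical names introduced for illustration.

```typescript
type Change = "provider-key-rotated" | "image-redeployed" | "api-token-rotated";

// Endpoint to call after each kind of change, as described in this README.
const FOLLOW_UP: Record<Change, string> = {
  "provider-key-rotated": "/api/instance/restart-gateway", // re-reads ~/.hermes/.env
  "image-redeployed": "/api/instance/restart", // respawns from the latest image
  "api-token-rotated": "/api/instance/restart",
};

// Hypothetical helper: build the follow-up request for a given change.
function followUp(workerUrl: string, change: Change): { method: "POST"; url: string } {
  return { method: "POST", url: new URL(FOLLOW_UP[change], workerUrl).toString() };
}

console.log(followUp("https://my-worker.example.workers.dev", "provider-key-rotated"));
```

The gateway restart is the cheaper option because the container itself stays alive.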
| Name | Required | Purpose |
|---|---|---|
| `ANTHROPIC_API_KEY` | ¹ | Anthropic API key, written to `~/.hermes/.env` |
| `OPENROUTER_API_KEY` | ¹ | OpenRouter API key, written to `~/.hermes/.env` |
| `OPENAI_API_KEY` | ¹ | OpenAI API key, written to `~/.hermes/.env` |
| `API_TOKEN` | No | Bearer token required on `/v1/*` and `/api/*` (recommended in production) |
| `HERMES_GATEWAY_TOKEN` | No | Bearer token used between the Worker and the Hermes API server (auto-default if omitted) |
| `HERMES_DEFAULT_MODEL` | No | Default model id (e.g. `anthropic/claude-sonnet-4-5`) |
| `DASHBOARD_HOSTNAME` | No | Hostname proxied to the Hermes native dashboard (see "Native dashboard") |
¹ At least one of the three provider keys is required.
Push secrets with npx wrangler secret put <NAME>. Plain config values (like DASHBOARD_HOSTNAME and HERMES_DEFAULT_MODEL) can also live under [vars] in wrangler.toml, but secrets must use wrangler secret put to stay out of the deployed bundle.
- Set `API_TOKEN`. Without it, anyone who finds your `*.workers.dev` URL can talk to your Hermes (and bill your provider key). The token is a single shared secret; rotate it with `wrangler secret put API_TOKEN` followed by `POST /api/instance/restart` if you suspect compromise.
- The container is single-tenant. Anyone who can reach `/v1/chat/completions` reaches the same Hermes session/state. If you need multi-user separation, run multiple deployments.
- Hermes' API server runs with `GATEWAY_ALLOW_ALL_USERS=true` so the Worker proxy can reach it. The Worker is the only gate, so keep `API_TOKEN` set in production.
- Cloudflare AI Gateway is supported if you'd rather pay through Cloudflare than the upstream provider; set Hermes' OpenAI / Anthropic endpoint URLs accordingly via `hermes config set` in a custom `start-hermes.sh` override.
wrangler deploy complains the Docker CLI isn't available.
Start Docker Desktop (or a compatible alternative selected via `docker context` or `WRANGLER_DOCKER_BIN`). Cloudflare builds the Sandbox image locally before pushing it.
Cold start hangs for several minutes.
The first build downloads ~3 GB of Hermes dependencies. Subsequent boots reuse the cached image. If the gateway never reaches port 18789, check GET /api/instance/logs for the tail of /tmp/hermes-server.log.
`hermes config set model` fails with `requires an interactive terminal`.
You are running the wrong subcommand. Use hermes config set model "<provider>/<model>" (note the literal key model), not hermes model (which is interactive).
Chat replies are empty (0 prompt + 0 completion tokens).
Likely the provider key is missing or wrong. Verify with GET /api/instance/logs that ~/.hermes/.env contains the expected entry, then POST /api/instance/restart-gateway.
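A client can detect this symptom automatically by inspecting the `usage` block of the OpenAI-compatible response. The sketch below assumes the standard `prompt_tokens` / `completion_tokens` field names; `looksLikeMissingProviderKey` is a hypothetical helper, and a zero count is only a hint, not proof of a bad key.

```typescript
// Subset of the OpenAI-compatible chat completion response used here.
interface ChatCompletion {
  choices: Array<{ message: { content: string | null } }>;
  usage?: { prompt_tokens: number; completion_tokens: number };
}

// Hypothetical check for the symptom above: zero tokens counted in both directions.
function looksLikeMissingProviderKey(res: ChatCompletion): boolean {
  const usage = res.usage;
  return usage !== undefined && usage.prompt_tokens === 0 && usage.completion_tokens === 0;
}

const suspicious: ChatCompletion = {
  choices: [{ message: { content: "" } }],
  usage: { prompt_tokens: 0, completion_tokens: 0 },
};
console.log(looksLikeMissingProviderKey(suspicious));
```

When the check fires, `GET /api/instance/logs` is the next place to look.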
The dashboard hostname returns 404 / SSL mismatch.
Confirm: (1) the hostname is set in wrangler.toml, (2) the Worker route is configured, (3) a proxied (orange-cloud) DNS record exists for that hostname, (4) it is covered by your Universal SSL or a custom certificate.
- PID file race on rapid boots. If multiple chat requests hit a cold container in parallel, Hermes may log `PID file race lost to another gateway instance` for the losing process(es). The winner serves traffic correctly. A single `wake` request before opening the floodgates avoids the race.
- Windows CRLF line endings. Cloning on Windows can mangle `container/start-hermes.sh`. The Dockerfile strips CRs via `sed -i 's/\r$//'`, but make sure your editor saves shell scripts with LF endings.
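The wake-first workaround for the PID file race can be sketched on the client side as follows. `wakeThenFanOut` is a hypothetical helper, and the sketch assumes the wake call resolves only once the container is ready to serve, which this README does not state explicitly.

```typescript
// Hypothetical helper: send one wake request, wait for it to finish, then run
// the remaining requests in parallel so that only one boot is triggered.
async function wakeThenFanOut<T>(
  wake: () => Promise<unknown>,
  tasks: Array<() => Promise<T>>,
): Promise<T[]> {
  await wake(); // e.g. POST /api/instance/wake
  return Promise.all(tasks.map((task) => task()));
}

// Demo with stand-in functions instead of real network calls.
const log: string[] = [];
wakeThenFanOut(
  async () => { log.push("wake"); },
  [
    async () => { log.push("chat-1"); return 1; },
    async () => { log.push("chat-2"); return 2; },
  ],
).then((results) => console.log(log.join(" -> "), results));
```

In real use, `wake` would wrap a `fetch` to `/api/instance/wake` and each task a call to `/v1/chat/completions`.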
Issues and pull requests are welcome. See CONTRIBUTING.md for the rules of the road.
- Nous Research for building Hermes.
- The Cloudflare team for the Sandbox SDK and the moltworker reference implementation that this project takes obvious inspiration from.
Apache License 2.0. See LICENSE.