orcha — Agent Team Orchestrator

A local/private orchestrator for supervising a team of AI agents working on software projects. The core question it answers is who is working on what, what do they need, what changed, and where is it running — not "here are many task logs".

Conceptual model (intentionally small)

objective ── sessions ── targets (where they run)
          ├─ workspaces (isolated / shared scratch / pr_branch)
          ├─ pull requests ── pr_feedback
          ├─ questions (needs-user)
          └─ artifacts (durable outputs, not stdout)

A manager is just a session with role=manager. Every objective starts with one. Work can produce zero, one, or many PRs and can run locally or on SSH targets.

Layout

Package	Responsibility
`internal/model`	Domain types + the state machines (session/objective transitions, target scheduling, PR push safety).
`internal/store`	SQLite (WAL) persistence. Normalized current-state tables; `events` is audit history. Dashboard queries return small rows; transcripts load separately by `seq` cursor. Atomic locks and target-slot claims.
`internal/agent`	The provider runtime contract (`start/send/cancel/resume/read_events`) + an in-process scriptable `FakeProvider`.
`internal/forge`	Git + code-host abstraction (push / open / refresh PR) with an in-memory `Fake`.
`internal/orch`	The orchestrator: scheduling, locks, loop guards, the session driver, steering, the PR workflow, the PR-feedback monitor, and the manager tool surface.
`internal/api`	HTTP API + incremental SSE streaming.
`cmd/orcha`	Wires it together with fakes and a dense dashboard; runnable with no external services.

Run

go run ./cmd/orcha            # serves :8080 with fake providers + fake forge
curl -X POST localhost:8080/api/objectives \
  -d '{"title":"Port subsystem to Rust","prompt":"...","agent":"claude"}'
open http://localhost:8080/   # dense dashboard

Safety invariants enforced (with tests)

No late completion resurrects canceled work — terminal session statuses have no outgoing transitions; the store validates transitions inside a transaction (store.UpdateSessionStatus, model.ValidateSessionTransition).
One writer per workspace / one updater per PR branch — row-unique locks.
Draining targets accept no new sessions; capacity is bounded — atomic ClaimTargetSlot.
Never push to a merged PR; a closed PR creates a manager decision — model.EvaluatePush + orch.UpdatePR.
Force push requires an explicit, recorded reason.
Loop guards (same-error, no-progress, per-session model-call cap) pause the session, record a compact reason, and open a question — instead of burning quota.
Usage exhaustion falls back to another provider or asks the user; it never retries an exhausted provider.
No long external op inside a DB transaction — model/forge/SSH calls happen between short store operations.
Dashboard stays small — dashboard queries select no transcript content; transcripts paginate via /api/sessions/:id/messages?after= and stream via /api/sessions/:id/stream.

Tests

go test ./...          # unit + integration
go test -race ./...    # the session driver is concurrent

The suite covers the spec's required unit and integration tests, including: objectives opening two independent PRs, PR feedback spawning a follow-up while another worker runs, interactive vs non-interactive steering, remote log streaming, remote cancel killing the process group, and the small-dashboard guarantee under large transcripts.

Backends

The orchestrator core (model, storage, scheduler primitives, locks, guards, PR workflow, feedback monitor, manager tools, HTTP API) runs against pluggable backends. Both fakes (for offline/tested flows) and real implementations exist:

Concern	Interface	Fake	Real
Agent runtime	`agent.Provider`	`agent.NewFake`	`agent.NewClaude` (interactive stream-json), `agent.NewCodex` (`codex exec`, resume-based steering), `agent.NewTmuxAgent` (interactive TUI in attachable tmux)
Terminal multiplexing	`tmux.Controller`	—	real `tmux` over local/SSH
Remote machines	`exec.SSHExecutor`	—	real `ssh` targets (register, health-check, bootstrap)
Execution location	`exec.Executor`	—	`exec.NewLocal` (process group), `exec.NewSSH` (`ssh -tt`)
Code host / VCS	`forge.Forge`	`forge.NewFake`	`forge.NewGit` (`git` + `gh`)
Workspace checkout	`workspace.Preparer`	(row only)	`workspace.New` (mirror cache + fresh-upstream clone)
Manager tools	`mcp` server	(call orch directly)	`Orchestrator.ManagerMCPHandler` (MCP over HTTP)

cmd/orcha flags: -fake-agents (offline agents) and -real-forge (git+gh forge and real workspace checkouts). Live backend tests are gated behind ORCHA_CLAUDE_LIVE, ORCHA_SSH_TEST_HOST, and ORCHA_GH_LIVE.

Auto-prep for spawned workers: when an objective carries a repo (repo/base_branch on the objective, or a per-spawn override on spawn_session), coding sessions the manager spawns get a fresh isolated checkout automatically. The checkout is prepared at start time on the session's placed target (so the clone and the agent run on the same machine), branched off the latest upstream. This closes the loop: objective → manager spawns worker → worker runs in a real checkout → publish_pr.

Workspace freshness: every prepared workspace is based on the latest upstream. A per-target bare mirror cache gives clone speed and build/cache locality, but the isolated checkout always re-fetches from the real origin and branches off the freshly-fetched base — never a stale local copy. PR follow-up workspaces check out the PR branch at its fresh head. Preparation runs through the exec.Executor, so it works identically on local and SSH targets.

Scheduler driver

orch.Scheduler makes the system self-driving: each tick it finds runnable queued/waiting-capacity sessions and starts them, respecting declared dependencies, a global concurrency cap, per-target capacity, workspace/PR-branch locks, and provider usage. It reacts promptly via a wake hook (SetNotify) when work is created or completes, falling back to an idle tick. Creating an objective therefore auto-starts its manager, which spawns workers the scheduler then runs — no manual restart. Dependencies that fail/cancel cancel their dependents; exhausted providers park the session and ask the user instead of respinning. Flags: -max-concurrent, -schedule-interval.

Manager tool-calling

Every objective starts with a manager session. The manager agent drives the orchestrator through tools exposed over MCP: Orchestrator.ManagerMCPHandler serves a minimal Streamable-HTTP MCP server (mounted at /mcp/<sessionID>), and each manager session's Claude is launched with --mcp-config pointing at its own endpoint (Config.ManagerMCPBaseURL). Tool calls — spawn_session, ask_user, publish_pr, update_pr, comment_pr, create_note, mark_objective_done, cancel_session — are scoped to the calling session and execute the corresponding orchestrator methods. Spawned workers are then auto-started by the scheduler.

Verified live (ORCHA_LIVE_MANAGER=1): a real Claude manager connects over HTTP and calls spawn_session, creating a worker in the orchestrator.

Remote (SSH) machines

Register an SSH box and sessions run there — agent, tmux pane, git checkout, and all. A target carries host/user/work_root (plus ssh_port/identity_file/ bootstrap in metadata); everything routes through exec.SSHExecutor (which wraps commands in ssh -tt), so:

the agent's tmux session runs on the remote host, attachable with ssh -t user@host tmux attach -t orcha-<id>,
workspace checkouts clone on the remote machine,
process-group cancellation works (kill the local ssh → remote SIGHUP).

Add one via POST /api/targets (or the dashboard's "add ssh machine" form):

curl -X POST localhost:8080/api/targets -H 'Content-Type: application/json' -d '{
  "name":"gpu-box","kind":"ssh","host":"1.2.3.4","user":"bot",
  "work_root":"/home/bot/work","capacity_sessions":4,
  "bootstrap":"mkdir -p /home/bot/work"
}'

Registration runs a doctor: one round trip over the executor that checks connectivity, tmux, git, a writable work root, at least one agent CLI (claude/codex), and gh (informational). The target only goes online if all required checks pass — so a reachable-but-under-provisioned box is caught at registration, not on the first session. The POST /api/targets response and the dashboard form both show exactly what's missing; re-run anytime with POST /api/targets/:id/doctor. Direct work to a box by pinning: spawn_session(..., target: "gpu-box"), or drain/disable to take it out of rotation. Requirements on the remote host: tmux, git, the agent CLIs and their auth (the doctor checks presence, not auth), plus key-based SSH (the executor uses BatchMode=yes). Verified locally; a live remote test is gated behind ORCHA_SSH_TEST_HOST.

Interactive tmux sessions (you can watch and take over)

Run with -tmux and every session becomes a real, attachable tmux session running the agent's interactive TUI (or a plain shell). agent.TmuxProvider (over tmux.Controller, which works on local and SSH targets):

starts a detached tmux new-session running the program in the workspace dir,
streams the pane via pipe-pane into the session transcript,
steers by typing into the live pane with send-keys (truly interactive),
cancels with kill-session,
and records the attach command on the session, so a human runs tmux attach -t orcha-<sessionID> (or ssh -t <host> tmux attach -t ... for SSH targets) to watch or take over live.

Verified against real tmux: interactive shell, send-keys steering, pipe-pane streaming, and kill-session cancellation.

A tmux manager session is launched with the same MCP/tool flags as the headless one (NewTmuxClaude), so an attachable manager can also drive the orchestrator via its tools. The live pane renders in the dashboard: the GET /api/sessions/:id/screen endpoint returns a capture-pane snapshot (via the agent.Snapshotter capability), which the dashboard polls into a terminal panel alongside the tmux attach command.

Interactivity: Claude vs Codex (headless mode)

The two agents differ in how steering works, and the UI reflects each session's mode (interactive vs noninteractive):

Claude has a real persistent mode (--input-format/--output-format stream-json), so a session is one long-lived process and steering is just another user message to its stdin — verified live including multi-turn.
Codex has no stable streaming-stdin equivalent; its only stable programmatic mode is the one-shot codex exec (true interactive access is the experimental app-server protocol). So Codex is non-interactive and steered via the spec's cancel/resume protocol — but resume preserves the conversation: Codex emits a thread_id on start (captured as provider_session_id), and a resumed turn uses codex exec resume <thread_id> so context carries over rather than restarting. Verified live (ORCHA_CODEX_LIVE=1): a resumed thread recalls earlier context.

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
cmd/orcha		cmd/orcha
internal		internal
scripts		scripts
ui		ui
.gitignore		.gitignore
README.md		README.md
go.mod		go.mod
go.sum		go.sum

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

orcha — Agent Team Orchestrator

Conceptual model (intentionally small)

Layout

Run

Safety invariants enforced (with tests)

Tests

Backends

Scheduler driver

Manager tool-calling

Remote (SSH) machines

Interactive tmux sessions (you can watch and take over)

Interactivity: Claude vs Codex (headless mode)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

orcha — Agent Team Orchestrator

Conceptual model (intentionally small)

Layout

Run

Safety invariants enforced (with tests)

Tests

Backends

Scheduler driver

Manager tool-calling

Remote (SSH) machines

Interactive tmux sessions (you can watch and take over)

Interactivity: Claude vs Codex (headless mode)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages