danieljhkim edited this page May 17, 2026 · 2 revisions

Orbit User Guide

Orbit is the engineering framework for AI coding agents. It brings the disciplines that keep code maintainable — tasks, decision records, structured audit, conflict-aware parallel execution — into the agent-driven workflow without making them expensive.

Orbit sits above your agent CLIs (Claude Code, Codex CLI, Gemini CLI, or any OpenAI-compatible endpoint). Agents still do the coding; Orbit makes sure every change is intentional, reviewable, and traceable months later.


Why Orbit Exists

Without Orbit, an engineer who repeatedly prompts agents ends up with:

  • No durable record of why a change was made
  • No acceptance criteria, so validation is ad-hoc
  • Merge conflicts when two agents edit the same files
  • No way to reconstruct the decision process six months later
  • Audit gaps that make incidents painful to investigate

Orbit enforces the good habits by default while making them cheap:

  • Every piece of work starts as a Task
  • Load-bearing decisions are captured as ADRs
  • Every tool call and provider exchange is recorded in a queryable Audit Log
  • Parallel agent runs are isolated in git worktrees with file-level locks
  • A local Knowledge Graph lets agents answer "who calls this function?" instead of grepping

The result: you can safely run multiple agents against the same codebase in parallel and still ship clean, auditable code.


Core Concepts

| Concept | What it is | Why it matters |
| --- | --- | --- |
| Task | Durable intent record (ORB-00042) with description, acceptance criteria, plan, execution summary, review threads, and artifacts | Survives sessions, branches, and time. Every commit carries the task ID. |
| ADR | Architecture Decision Record with proposed → accepted → superseded lifecycle | Captures the "why" for load-bearing choices so future agents and humans don't have to reverse-engineer it. |
| Design Doc | Numbered folder under docs/design/<feature>/ (1_overview.md, 2_design.md, 3_vision.md, 4_decisions.md) | Living documentation that is lint-checked for staleness against the code it describes. |
| Learning | Scoped, tagged project knowledge (orbit.learning.*) | Push-based knowledge for agents: they don't have to search a wiki; relevant learnings are surfaced automatically. |
| Friction | Captured operational pain with tags and resolution status | Makes self-reported tooling friction first-class so the team can systematically remove it. |
| Knowledge Graph | Parsed symbol-level graph (callers, implementors, refs, history) across Rust/Go/JS/TS/Python | Agents query real code structure instead of text search. |
| Audit Log | Append-only, structured events for every tool call, provider I/O, and state transition | Complete, tamper-evident history with agent identity attached. |
| Worktree + Locks | Each orbit run ship execution gets its own git worktree; file locks are reserved before agents start | Prevents two agents from editing the same files and producing unresolvable merge conflicts. |

Installation

Recommended: Clone + make install

This gives you the full, customizable framework (not just a binary).

# 1. Clone Orbit (choose a location you control)
git clone https://github.com/danieljhkim/orbit ~/code/orbit
cd ~/code/orbit

# 2. Build and install
make install          # respects $INSTALL_BIN_DIR (default: ~/.cargo/bin)

# 3. Verify
orbit --version       # should print 0.5.4 or later

Alternative: curl installer or Homebrew

curl -sSf https://raw.githubusercontent.com/danieljhkim/orbit/main/install.sh | sh

# or
brew install danieljhkim/tap/orbit

Claude Code Plugin (limited surface)

/plugin marketplace add danieljhkim/orbit
/plugin install orbit

Plugin vs CLI differences (important):

  • Plugin binary lives inside the Claude Code sandbox (not on $PATH)
  • No orbit web serve dashboard
  • No orbit run ship (workflows require the full CLI)
  • Works only with Claude Code (not Codex or Gemini CLI)

Use the CLI for real engineering work.


First-Time Setup (Human Walkthrough)

1. Global initialization

orbit init

This creates ~/.orbit/ — the canonical home for:

  • All task bundles (~/.orbit/tasks/workspaces/<id>/)
  • Global skills
  • Semantic embedding companion + models
  • Cross-workspace configuration

You only need to run this once per machine (or again after deleting ~/.orbit).

2. Per-workspace initialization

From inside the repository where you want to use Orbit:

orbit workspace init --mcp

What this does:

  • Creates .orbit/ at the repo root (your workspace-local state)
  • Registers the Orbit MCP server with your local agent CLIs (Claude Code, Codex, Gemini)
  • Seeds default skills under ~/.orbit/skills/
  • Builds the initial knowledge graph

Important files created:

  • .orbit/config.yaml — contains the stable workspace_id
  • .orbit/resources/ — activity and job definitions (customizable)
  • .orbit/state/audit/ — append-only audit events for this workspace

3. Verify everything works

orbit workspace show
orbit task list
orbit graph overview

You should see your workspace registered and an empty task list.


The Task Lifecycle

proposed ──approve──▶ backlog ──start──▶ in-progress ──(work)──▶ review ──approve──▶ done
     │                                    │
     │                                    └── reject ──▶ rejected
     │
     └── reject ──▶ rejected

States (visible in orbit task list --status and orbit task show):

  • proposed — Task exists but has not been approved for work
  • backlog — Approved, ready to be picked up
  • in-progress — An agent or human has started execution
  • review — Work complete; review threads are open
  • done — All review threads resolved and task approved
  • rejected / archived — Terminal states
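
The transitions in the diagram can be read as a small state table. The sketch below is an illustration of that table, not Orbit's implementation: next_state is a hypothetical helper, "work" is the diagram's label rather than a CLI subcommand, the second reject branch is modeled as leaving in-progress (as the diagram is drawn), and the archived state is left out because the diagram does not show how it is reached.

```shell
# next_state STATE ACTION
# Prints the state that follows STATE when ACTION is applied, based on the
# lifecycle diagram above. Returns 1 for transitions the diagram does not show.
next_state() {
  case "$1:$2" in
    proposed:approve)   echo backlog ;;
    proposed:reject)    echo rejected ;;
    backlog:start)      echo in-progress ;;
    in-progress:work)   echo review ;;
    in-progress:reject) echo rejected ;;
    review:approve)     echo done ;;
    *)
      echo "invalid transition: $1 + $2" >&2
      return 1
      ;;
  esac
}

# Walk the happy path from proposed to done.
state=proposed
for action in approve start work approve; do
  state=$(next_state "$state" "$action")
done
echo "$state"
```

Any pair not listed fails, which mirrors the idea that a task cannot skip from proposed straight to in-progress.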

Creating a task (two ways)

Via CLI (precise):

TASK_ID=$(orbit task add \
  --title "Add rate limiting to the auth endpoint" \
  --description "..." \
  --acceptance-criteria "- [ ] Returns 429 on > 100 req/min per IP\n- [ ] ..." \
  --workspace .)
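
Inside double quotes, the shell passes \n to the command as two literal characters, and this guide does not say whether orbit task add expands them. If you want real line breaks in the criteria, one option is to build the string with printf first. This is a sketch: the second criterion is a placeholder, and the orbit invocation is left commented out so the snippet runs without Orbit installed.

```shell
# Build acceptance criteria with real newlines, one checkbox per line.
# The second entry is a placeholder, not a criterion from this guide.
criteria=$(printf '%s\n' \
  '- [ ] Returns 429 on > 100 req/min per IP' \
  '- [ ] <second criterion>')

# Show the result: two lines, no literal backslash-n sequences.
printf '%s\n' "$criteria"

# Pass it using the same flags as the example above:
# orbit task add --title "..." --description "..." \
#   --acceptance-criteria "$criteria" --workspace .
```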

Via agent (recommended for complex work): Just say to your agent:

"Create an Orbit task for adding rate limiting to the auth endpoint. Include clear acceptance criteria as checkboxes."

The agent will call orbit.task.add through MCP.

Typical human flow for a task

# 1. See what's proposed or in backlog
orbit task list --status proposed,backlog

# 2. Inspect a task in detail (this is your primary debugging tool)
orbit task show ORB-00042

# 3. Approve it for work
orbit task approve ORB-00042

# 4. (Optional) Mark it started
orbit task start ORB-00042

# 5. Launch the dashboard to watch execution live
orbit web serve

Shipping Work: orbit run ship

This is the primary workflow for getting tasks implemented by agents.

# Ship everything ready in the backlog (auto mode)
orbit run ship

# Ship specific tasks
orbit run ship ORB-00042 ORB-00043

# Local mode (no PRs, just worktree + merge locally)
orbit run ship --mode local

What run ship actually does (the gated pipeline):

  1. Discovers eligible tasks (backlog or explicitly passed)
  2. For each task, reserves file locks on its declared context_files
  3. If any lock conflicts with another active run → the run is rejected before agents start (no wasted work, no merge hell later)
  4. Each approved task gets its own git worktree under .orbit/state/worktrees/
  5. The task's assigned Job (usually task_auto_pipeline) is executed:
    • Agent is spawned (via backend: cli by default)
    • Agent receives the full task context via MCP tools
    • Agent edits code inside its private worktree
    • On success, the changes are collected
  6. In --mode pr (default): a PR is opened against the base branch with the task ID in the title/body
  7. In --mode local: changes are merged into your current branch (use with caution)
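
Steps 2 and 3 amount to a set-intersection check made before any agent starts. The following is a minimal sketch of that idea using standard tools only; overlapping_files and reserve_or_reject are hypothetical helpers, the file paths are made up, and Orbit's real lock store is not modeled.

```shell
# overlapping_files LIST_A LIST_B
# Each argument is a newline-separated list of paths (a stand-in for a
# task's declared context_files). Prints the paths present in both lists.
overlapping_files() {
  {
    printf '%s\n' "$1" | sort -u
    printf '%s\n' "$2" | sort -u
  } | sort | uniq -d
}

# reserve_or_reject ACTIVE_LOCKS REQUESTED
# Prints "reserved" when nothing overlaps; otherwise prints "rejected",
# lists the conflicting paths on stderr, and returns 1.
reserve_or_reject() {
  conflicts=$(overlapping_files "$1" "$2")
  if [ -n "$conflicts" ]; then
    echo "rejected"
    printf 'already locked: %s\n' "$conflicts" >&2
    return 1
  fi
  echo "reserved"
}

# Illustrative data: one active run and two incoming requests.
active='src/auth/mod.rs
src/auth/limits.rs'
requested_a='src/auth/limits.rs
src/http/router.rs'
requested_b='docs/README.md'

reserve_or_reject "$active" "$requested_a" || true   # overlaps on limits.rs
reserve_or_reject "$active" "$requested_b"           # no overlap
```

Rejecting at reservation time is what makes the failure cheap: no worktree has been created and no agent has run yet.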

Inspect what happened:

orbit run history -j task_auto_pipeline
orbit run show <run-id>
orbit run events <run-id>
orbit run logs <run-id>

The Web Dashboard

orbit web serve

Opens a local web UI (default port 8080) showing:

  • Live task backlog with status
  • Per-agent scoreboard (duel-plan, PR throughput, review outcomes)
  • Real-time audit log as agents work
  • Worktree status and active locks
  • Job run history and traces

This is the best way to monitor parallel agent fleets.


Knowledge Graph

Orbit maintains a parsed, content-addressed graph of your codebase.

# Rebuild from scratch (slow first time on large repos)
orbit graph build

# Incremental update (normal usage)
orbit graph update

# Query examples
orbit graph search "rate_limit"
orbit graph show "src/auth.rs:42"
orbit graph callers "auth::check_rate_limit"
orbit graph implementors "RateLimiter"

Supported languages: Rust, Go, JavaScript/TypeScript, Python, Java (partial).

The graph is what makes orbit.graph.* MCP tools powerful for agents — they can ask "who calls this?" instead of guessing from grep.


Semantic Search (optional but powerful)

# One-time setup (downloads ~150MB companion + model)
orbit semantic install

# Backfill existing tasks
orbit semantic reindex

# Search
orbit semantic search "race condition in the lock scheduler"
orbit semantic related ORB-00042

After install, every task write is automatically embedded in the background. Use this when you remember the intent of a past task but not the ID or exact title.


Architecture Decision Records (orbit adr)

# Create a new ADR
orbit adr add --title "Use worktree-per-task for parallel execution"

# List and inspect
orbit adr list --status proposed
orbit adr show ADR-0017

# Accept after implementation ships (requires task ID)
orbit adr update ADR-0017 --status accepted --task ORB-00042

# Supersede an old decision
orbit adr supersede ADR-0009 --by ADR-0023

ADRs live under .orbit/adrs/ (proposed/, accepted/, superseded/) and are not gitignored by default — they travel with the repo.


Design Documentation (orbit design)

Orbit enforces a fixed layout for every feature design: four numbered files, plus specs/ and references/ directories:

docs/design/<feature>/
├── 1_overview.md
├── 2_design.md
├── 3_vision.md
├── 4_decisions.md
├── specs/
└── references/glossary.md

# Scaffold a new feature design folder
orbit design init my-new-feature

# Check for stale docs (Last updated date vs code mtime)
orbit design check

The make check-design-docs target in the Orbit repo itself uses this.
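
To make the staleness idea concrete, here is a simplified stand-in. It compares file modification times with find -newer instead of reading the doc's Last updated date, so it is not what orbit design check does internally; newer_than_doc and demo_stale_check are hypothetical names, and the demo runs in a throwaway directory.

```shell
# newer_than_doc DOC CODE_DIR
# Prints every file under CODE_DIR modified more recently than DOC.
newer_than_doc() {
  find "$2" -type f -newer "$1"
}

# Demo in a temporary directory: an old design doc next to freshly
# written code should be reported as stale.
demo_stale_check() (
  set -eu
  dir=$(mktemp -d)
  trap 'rm -rf "$dir"' EXIT
  mkdir -p "$dir/docs/design/my-new-feature" "$dir/src"
  doc="$dir/docs/design/my-new-feature/2_design.md"
  : > "$doc"
  touch -t 202001010000 "$doc"   # pretend the doc was last touched in 2020
  : > "$dir/src/lib.rs"          # code written just now
  if [ -n "$(newer_than_doc "$doc" "$dir/src")" ]; then
    echo "stale"
  else
    echo "fresh"
  fi
)

demo_stale_check
```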


Project Learnings & Operational Friction

Learnings are push-based knowledge for agents:

orbit learning add --title "..." --tag rust,performance --scope "crates/orbit-engine/**"
orbit learning list --tag rust

Frictions capture things that hurt when using Orbit:

orbit friction add --title "MCP tool discovery is slow on first run" --tag mcp,performance
orbit friction resolve FR-0003
orbit friction stats

These surfaces exist so that recurring pain becomes visible data instead of Slack messages that disappear.


Review Threads

When an agent (or human) finishes work on a task, it opens review threads instead of just merging.

orbit task review-thread list ORB-00042
orbit task review-thread reply <thread-id> --body "..."
orbit task review-thread resolve <thread-id>

Only after all threads are resolved can the task be approved to done.

This is the structured alternative to "just look at the PR diff."


Sandboxing & Parallel Execution

  • On macOS: agent subprocesses run under sandbox-exec with an FsProfile that restricts filesystem access to the worktree only.
  • On Linux/Windows: in-process guards + the git worktree isolation still apply (no OS-level sandbox yet).
  • Network egress is policy-gated per activity.
  • File locks (orbit task locks) are the real safety net for parallel run ship — they fail fast on overlap instead of producing merge conflicts later.
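
The worktree isolation mentioned above relies on plain git behavior and can be tried without Orbit. The sketch below creates a throwaway repository and two worktrees with git worktree add. The orbit/<task-id> branch names are an assumption made for illustration, and the paths only mimic the .orbit/state/worktrees/ location described in this guide.

```shell
# demo_worktree_isolation
# Creates two worktrees from one repository and checks that a file written
# in the first does not appear in the second.
demo_worktree_isolation() (
  set -eu
  repo=$(mktemp -d)
  trap 'rm -rf "$repo"' EXIT
  cd "$repo"
  git init -q .
  # An empty base commit gives both worktrees a starting point.
  git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m "base"
  mkdir -p .orbit/state/worktrees
  for task in ORB-00042 ORB-00043; do
    # Hypothetical branch naming; each worktree gets its own branch.
    git worktree add -b "orbit/$task" ".orbit/state/worktrees/$task" HEAD \
      >/dev/null 2>&1
  done
  echo "edit by first agent" > .orbit/state/worktrees/ORB-00042/notes.txt
  if [ -e .orbit/state/worktrees/ORB-00043/notes.txt ]; then
    echo "leaked"
  else
    echo "isolated"
  fi
)

demo_worktree_isolation
```

Separate working directories keep uncommitted edits apart, but two branches can still change the same file, which is why the file locks are checked first.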

.orbit Directory Layout (What Lives Where)

.orbit/                          # workspace-local (safe to delete → clean slate)
├── config.yaml                  # workspace_id + config
├── tasks/                       # symlinks → ~/.orbit/tasks/workspaces/<id>/
├── adrs/                        # proposed/, accepted/, superseded/
├── learnings/                   # your team's durable knowledge
├── frictions/                   # local friction log + tags.yaml
├── knowledge/                   # parsed graph artifacts
├── resources/                   # activities, jobs, executors, policies (customizable)
└── state/
    ├── audit/                   # append-only JSONL events
    ├── job-runs/                # per-run metadata + step traces
    ├── worktrees/               # live git worktrees for agent runs
    ├── logs/                    # captured agent stdout/stderr
    └── scoreboard/              # rolling counters (PRs, reviews, etc.)

~/.orbit/                        # global (machine-level, survives repo moves)
├── tasks/
│   ├── index.sqlite             # authority for ORB-XXXXX IDs
│   └── workspaces/<workspace-id>/<task-id>/   # canonical task bundles
├── skills/                      # SKILL.md files (routable via MCP)
├── embed/                       # semantic companion binary + models
└── config.toml                  # global settings

Gitignore recommendation:

.orbit/*
!.orbit/adrs/
!.orbit/learnings/

Keep project memory in git; keep runtime execution state out.
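
If you want to confirm how those three patterns behave, git check-ignore reports whether a path is ignored. The sketch below tries them in a throwaway repository; check_orbit_gitignore is a hypothetical helper and the example file names are made up.

```shell
# check_orbit_gitignore
# Writes the recommended patterns into a temporary repository and prints,
# for a few sample paths, whether git would ignore or keep them.
check_orbit_gitignore() (
  set -eu
  repo=$(mktemp -d)
  trap 'rm -rf "$repo"' EXIT
  cd "$repo"
  git init -q .
  printf '%s\n' '.orbit/*' '!.orbit/adrs/' '!.orbit/learnings/' > .gitignore
  mkdir -p .orbit/adrs/accepted .orbit/learnings .orbit/state/audit
  : > .orbit/adrs/accepted/example.md
  : > .orbit/learnings/example.md
  : > .orbit/state/audit/events.jsonl
  for path in \
    .orbit/adrs/accepted/example.md \
    .orbit/learnings/example.md \
    .orbit/state/audit/events.jsonl
  do
    # check-ignore exits 0 when the path is ignored, 1 when it is not.
    if git check-ignore -q "$path"; then
      echo "ignored $path"
    else
      echo "kept $path"
    fi
  done
)

check_orbit_gitignore
```

The negations work because .orbit/* ignores the entries inside .orbit/ rather than the directory itself, so git still descends into adrs/ and learnings/.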


Common Command Cheat Sheet

# Daily
orbit task list --status backlog,in-progress,review
orbit task show ORB-00042
orbit run history --limit 20
orbit web serve

# Graph & search
orbit graph search "function_name"
orbit semantic search "the thing about rate limiting"

# Audit & debugging
orbit audit list --type tool_call --limit 50
orbit run show <run-id> --json
orbit run trace <run-id>

# Maintenance
orbit graph build
orbit semantic reindex
orbit skill doctor

Full reference: orbit <subcommand> --help for every command.


Getting Help Inside Your Agent

Once orbit workspace init --mcp has been run, your agents have access to the full orbit.* tool surface. Just ask in natural language:

"Use orbit to find all callers of validate_token and create a task for adding logging."

The orbit skill routes the request to the correct orbit.* tool.


Current Status (v0.5.x)

Orbit is under active development. Core functionality (tasks, ADRs, graph, audit, run ship, MCP, sandboxing on macOS, web dashboard) is stable enough for daily personal use by the authors.

Roadmap items still evolving:

  • More language support in the knowledge graph
  • Linux/Windows OS-level sandboxing
  • backend: http agent loops (currently backend: cli is primary)
  • Groundhog distributed execution preview

This wiki page is the recommended starting point for human users. The README in the repo is intentionally short because the real depth lives here and in the design docs under docs/design/.