Docs · Getting Started · Demo Catalog · Discord · X
Perstack is a containerized harness for agentic apps.
- Harness = Runtime + Config – Instructions, agent topology, and tools are defined in TOML, not wired in code. The runtime executes what you declare in config.
- Dev-to-prod in one container – Same image, same sandbox, same behavior from local to production.
- Full observability – Trace every delegation, token, and reasoning step. Replay any run from checkpoints.
Perstack draws clear boundaries – between your app and the harness, between the harness and each agent – so you can keep building without fighting the mess.
Perstack keeps expert definition, orchestration, and application integration as separate concerns.
create-expert scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure.
To get started, use the built-in create-expert expert to scaffold your first agent:
```sh
# Use `create-expert` to scaffold a micro-agent team named `bash-gaming`
docker run --pull always --rm -it \
  --env-file .env \
  -v ./bash-gaming:/workspace \
  perstack/perstack start create-expert \
  --provider <provider> \
  --model <model> \
  "Form a team named bash-gaming. They build indie CLI games with both AI-facing non-interactive mode and human-facing TUI mode built on Ink + React. Their games must be runnable via npx at any time. Games are polished, well-tested with full playthroughs – TUI mode included."
```

create-expert is a built-in expert. It defines a team of single-purpose micro-agents – called "experts" in Perstack.
```text
create-expert : Thin coordinator that delegates to the experts
├── @create-expert/plan : Expands the user's request into a comprehensive plan
├── @create-expert/write : Produces perstack.toml from plan
└── @create-expert/verify : Runs the expert with a test query and checks the completion
```
The full definition is available at definitions/create-expert/perstack.toml.
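For orientation, here is a minimal sketch of what such a team definition might look like in `perstack.toml`. The table and field names are assumptions based on the feature list in this README (instruction, description, delegates); consult the `perstack.toml` Reference for the actual schema.

```toml
# Hypothetical sketch, not the actual create-expert definition.
# Table layout and field names are assumptions, not the real schema.
[experts.create-expert]
description = "Thin coordinator that delegates to the experts"
instruction = "Expand the request into a plan, write perstack.toml, then verify it."
delegates = ["@create-expert/plan", "@create-expert/write", "@create-expert/verify"]
```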
While create-expert is running, the TUI shows real-time status – active delegation tree, token usage, reasoning streams, and per-agent progress:
```text
2026/03/13 08:15:40.083, @bash-gaming/build · bash-gaming
  Reasoning
  ....
────────────────────────────────────────────────────────────────────────────────
Query: Form a team named bash-gaming. They build indie CLI games with both…
1 running · 3 waiting · 4m 06s · fireworks
Tokens: In 2.9M (Cached 2.1M, Cache Hit 72.69%) · Out 46.2k
▸ create-expert · accounts/fireworks/models/kimi-k2p5 · 2.4% · Waiting for delegates
│ ▸ @create-expert/verify · accounts/fireworks/models/kimi-k2p5 · 11.2% · Waiting for delegates
│ │ ▸ bash-gaming · accounts/fireworks/models/kimi-k2p5 · 6.4% · Waiting for delegates
│ │ │ @bash-gaming/build · accounts/fireworks/models/kimi-k2p5 · 3.2% · Streaming Reasoning...
```
To run your experts on an actual task, use the perstack start command:
```sh
# Let `bash-gaming` build a Wizardry-like dungeon crawler
docker run --pull always --rm -it \
  --env-file .env \
  -v ./<result-dir>:/workspace \
  -v ./bash-gaming/perstack.toml:/definitions/perstack.toml:ro \
  perstack/perstack start bash-gaming \
  --config /definitions/perstack.toml \
  --provider <provider> \
  --model <model> \
  "Create a Wizardry-like dungeon crawler in a fixed 10-floor labyrinth with complex layouts, traps, fixed room encounters, and random battles. Include special-effect gear drops, leveling, and a skill tree for one playable character. Balance difficulty around build optimization. Death in the dungeon causes loss of one random equipped item."
```

Here is an example game built with these commands: demo-catalog. Across 5 runs on 4 providers, the same experts and queries were used; 4 out of 5 runs produced a working dungeon crawler. Full run logs are included in the repository.
perstack log provides a TUI for browsing past runs and their delegation trees. Every delegation – who called whom, what succeeded, what failed – is visible at a glance:
```text
$ npx perstack log --job <jobId>
Runs (create-expert)                                  Enter:Select b:Back q:Quit
> ● create-expert Form a team named bash-gaming. They build indie CLI games with both AI-faci…
| \
|  ● @create-expert/plan Create a team named bash-gaming. They build indie CLI games with bo…
| /
● create-expert (resumed)
| \
|  ● @create-expert/write Create perstack.toml at /workspace/plan.md. This is a new team cre…
| /
● create-expert (resumed)
| \
|  ● @create-expert/verify Verify perstack.toml at /workspace/perstack.toml against plan at …
|  | \
|  |  ● bash-gaming Create a CLI word guessing game called 'cryptoword' published as @bash-ga…
|  |  | \
|  |  |  ● @bash-gaming/plan Create a CLI word guessing game 'cryptoword' published as @bash-g…
|  |  | /
|  |  ● bash-gaming (resumed)
|  |  | \
|  |  |  ● @bash-gaming/build Implement the complete cryptoword game package at /home/perstack…
|  |  | /
|  |  ● bash-gaming (resumed)
|  |  | \
|  |  |  ● @bash-gaming/verify Verify the cryptoword package at /home/perstack/cryptoword/: 1…
|  |  | /
|  |  ● bash-gaming (resumed)
|  | /
|  ● @create-expert/verify (resumed)
| /
● create-expert (resumed)
```

Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client.
```text
┌─────────────────┐                ┌──────────────────┐
│  Your app       │     events     │  perstack run    │
│  (React, TUI…)  │  ◀───────────  │  (@perstack/     │
│                 │   SSE / WS /   │   runtime)       │
│  @perstack/     │   any stream   │                  │
│  react          │                │                  │
└─────────────────┘                └──────────────────┘
     Frontend                            Server
```
Swap models, change agent topology, or scale the harness – without touching application code. @perstack/react provides hooks (useJobStream, useRun) that turn the event stream into React state. See the documentation for details.
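To illustrate the kind of event-to-state processing such hooks encapsulate, here is a self-contained sketch of folding a run's event stream into UI state. The event and state shapes below are invented for illustration; they are not Perstack's actual event schema.

```typescript
// Hypothetical event shapes, for illustration only -- not Perstack's real schema.
type RunEvent =
  | { type: "delegationStarted"; expert: string }
  | { type: "tokenUsage"; input: number; output: number }
  | { type: "runFinished"; expert: string };

interface UiState {
  active: string[]; // experts currently running
  tokensIn: number;
  tokensOut: number;
}

// Pure reducer: each incoming event produces the next UI state.
function reduceEvent(state: UiState, event: RunEvent): UiState {
  switch (event.type) {
    case "delegationStarted":
      return { ...state, active: [...state.active, event.expert] };
    case "tokenUsage":
      return {
        ...state,
        tokensIn: state.tokensIn + event.input,
        tokensOut: state.tokensOut + event.output,
      };
    case "runFinished":
      return { ...state, active: state.active.filter((e) => e !== event.expert) };
  }
}

// Folding a short stream: one delegation starts, reports usage, and finishes.
const events: RunEvent[] = [
  { type: "delegationStarted", expert: "create-expert" },
  { type: "tokenUsage", input: 1200, output: 300 },
  { type: "runFinished", expert: "create-expert" },
];
const finalState = events.reduce(reduceEvent, { active: [], tokensIn: 0, tokensOut: 0 });
console.log(finalState.active.length, finalState.tokensIn, finalState.tokensOut);
// → 0 1200 300
```

A hook like useJobStream would presumably perform this kind of fold internally and hand the resulting state to React; the reducer pattern also makes the processing easy to unit-test without a live run.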
```dockerfile
FROM perstack/perstack:latest

# Install extra dependencies and configure a non-root user here if needed:
# RUN apt-get update && apt-get install -y --no-install-recommends git && rm -rf /var/lib/apt/lists/*
# RUN useradd -m agent
# USER agent

COPY perstack.toml .
RUN perstack install
ENTRYPOINT ["perstack", "run", "my-expert"]
```

The image is Ubuntu-based, multi-arch (linux/amd64, linux/arm64), and ~74 MB. perstack install pre-resolves MCP servers and prewarms tool definitions for faster, reproducible startups. The runtime can also be imported directly as a TypeScript library (@perstack/runtime) for serverless environments. See the deployment guide for details.
- Docker
- An LLM provider API key (see Providers and Models)
There are two ways to provide API keys:
1. Pass host environment variables with -e
Export the key on the host and forward it to the container:
```sh
export FIREWORKS_API_KEY=fw_...

docker run --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query" --provider fireworks
```

2. Store keys in a .env file in the workspace
Create a .env file in the workspace directory. Perstack loads .env and .env.local by default:
```sh
# ./workspace/.env
FIREWORKS_API_KEY=fw_...
```

```sh
docker run --rm -it \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query"
```

You can also specify custom .env file paths with --env-path:

```sh
perstack start my-expert "query" --env-path .env.production
```

Three principles guide how Perstack approaches agentic app development:
- Quality is a system property, not a model property: Building agentic apps people actually use doesn't require an AI science degree, just a solid understanding of the problems you're solving.
- Keep your app simple and reliable: The harness is inevitably complex; Perstack absorbs that complexity so your agentic app doesn't have to.
- Do big things with small models: If a smaller model can do the job, there's no reason to use a bigger one.
Perstack introduces micro-agents – a multi-agent orchestration design built around purpose-specific agents, each with a single responsibility.
- Simple: A monolithic agent assembles its system prompt from hundreds of fragments. A multi-agent framework stacks abstraction layers and wires orchestration in code. A Perstack expert is one TOML section: instruction, delegates, done.
- Reliable: A plan agent that only plans, a build agent that only builds, a verify agent that only verifies – the pipeline structure itself prevents shortcuts and catches errors that a single generalist would miss.
- Reusable: Delegates are dependency management for agents – like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects.
Perstack ships a five-layer stack that gives micro-agents everything they need to run.
```text
┌───────────────────────────────────────────────────────────────────┐
│ Interface                                                         │
│ CLI · Event streaming · Programmatic API                          │
├───────────────────────────────────────────────────────────────────┤
│ Runtime                                                           │
│ Agentic loop · Event-sourcing · Checkpointing · Tool use          │
├───────────────────────────────────────────────────────────────────┤
│ Context                                                           │
│ System prompts · Prompt caching · AgenticRAG · Extended thinking  │
├───────────────────────────────────────────────────────────────────┤
│ Definition                                                        │
│ Multi-agent topology · MCP skills · Provider abstraction          │
├───────────────────────────────────────────────────────────────────┤
│ Infrastructure                                                    │
│ Sandbox isolation · Workspace boundary · Secret management        │
└───────────────────────────────────────────────────────────────────┘
```
Full feature matrix
| Layer | Feature | Description |
|---|---|---|
| Definition | `perstack.toml` | Declarative project config with global defaults (model, reasoning budget, retries, timeout) |
| | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert |
| | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions |
| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings |
| | Model tiers | Provider-aware model selection via `defaultModelTier` (low / middle / high) with fallback cascade |
| | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options |
| | Lockfile | `perstack.lock`: resolved snapshot of experts and tool definitions for reproducible deployments |
| Context | Meta-prompts | Role-specific system prompts (coordinator vs. delegate) with environment injection (time, working directory, sandbox) |
| | Context window tracking | Per-model context window lookup with usage ratio monitoring |
| | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts |
| | Prompt caching | Provider-specific cache control with cache-hit tracking |
| | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation |
| | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) |
| | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations |
| | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point |
| Runtime | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) |
| | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability |
| | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata |
| | Skill manager | Dynamic skill lifecycle (connect, discover tools, execute, disconnect) with adapter pattern |
| | Tool execution | Parallel MCP tool calls with priority classification (MCP → delegate → interactive) |
| | Error handling | Configurable retries with provider-specific error normalization and retryability detection |
| | Job hierarchy | Job → run → checkpoint structure with step continuity across delegations |
| | Streaming | Real-time reasoning and result deltas via streaming callbacks |
| Infrastructure | Container isolation | Docker image (Ubuntu, multi-arch, ~74 MB) with `PERSTACK_SANDBOX=1` marker |
| | Workspace boundaries | Path validation with symlink resolution to prevent traversal and escape attacks |
| | Env / secrets | `.env` loading with `--env-path`, `requiredEnv` minimal-privilege filtering, and protected-variable blocklist |
| | Exec protection | Filtered environment for subprocesses blocking `LD_PRELOAD`, `NODE_OPTIONS`, and similar vectors |
| | Install & lockfile | `perstack install` pre-resolves tool definitions for faster, reproducible startup |
| Interface | `perstack` CLI | `start` (interactive TUI), `run` (JSON events), `log` (history query), `install`, and expert management commands |
| | TUI | React/Ink terminal UI with real-time activity log, token metrics, delegation tree, and job/checkpoint browser |
| | JSON event stream | Machine-readable event output via `perstack run` with `--filter` for programmatic integration |
| | `@perstack/runtime` | TypeScript library for serverless and custom apps: `run()` with event listener, checkpoint storage callbacks |
| | `@perstack/react` | React hooks (`useRun`, `useJobStream`) and event-to-activity processing utilities |
| | Studio | Expert lifecycle management (create, push, version, publish, yank) via Perstack API |
| | Log system | Query execution history by job, run, step, or event type with terminal and JSON formatters |
| Topic | Link |
|---|---|
| Getting started | Getting Started |
| Architecture and core concepts | Understanding Perstack |
| Skills | Skills |
| Base skill (built-in tools) | Base Skill |
| Adding tools via MCP | Extending with Tools |
| Deployment | Deployment |
| Providers and models | Providers and Models |
| CLI reference | CLI Reference |
| perstack.toml reference | perstack.toml Reference |
| Events reference | Events Reference |
| API reference | API Reference |
demo-catalog runs the same experts and queries across multiple providers and models. Every run includes raw checkpoints and event logs – fully traceable, replayable, and ready for your own analysis. New demos and provider results are added continuously.
See CONTRIBUTING.md.
Apache License 2.0 β see LICENSE for details.