Run AI coding agents inside sandboxed Linux VMs. The agent gets full autonomy while your host system stays safe.
Uses Lima to create lightweight Debian VMs on macOS and Linux. Ships with dev tools, Docker, and a headless Chrome browser with Chrome DevTools MCP pre-configured.
Currently supports Claude Code. Other agents (Codex, etc.) can be added in the future.
Feedback welcome!
- macOS or Linux
- Lima (installed automatically via Homebrew if available)
- A Claude subscription (Pro, Max, or Team)
git clone https://github.com/sylvinus/agent-vm.git
cd agent-vm
# Add to your shell config
echo "source $(pwd)/claude-vm.sh" >> ~/.zshrc # zsh
echo "source $(pwd)/claude-vm.sh" >> ~/.bashrc # or bash
claude-vm-setup
Creates a VM template with dev tools, Docker, Chromium, and Claude Code pre-installed. During setup, Claude will launch once for authentication. After it responds, type /exit to continue with the rest of the setup. (We haven't found a way to automate this step yet.)
Options:
| Flag | Description | Default |
|---|---|---|
| `--minimal` | Only install git, curl, jq, and Claude Code. Skips Docker, Node.js, Python, Chromium, and the Chrome MCP server. | off |
| `--disk GB` | VM disk size in GB | 20 |
| `--memory GB` | VM memory in GB | 8 |
claude-vm-setup --minimal # Lightweight VM with just Claude
claude-vm-setup --disk 50 --memory 16 # Larger VM for heavy workloads
cd your-project
claude-vm
Clones the template into a fresh VM, mounts your current directory, and runs claude --dangerously-skip-permissions with IS_SANDBOX=1 to suppress the dangerous mode confirmation prompt (the VM itself is the sandbox). The VM is deleted when Claude exits.
Any arguments passed to claude-vm are forwarded to the claude command:
claude-vm -p "fix all lint errors" # Run with a prompt
claude-vm --resume # Resume previous session
claude-vm -c "explain this codebase" # Continue conversation
claude-vm-shell
Same as claude-vm but drops you into a bash shell instead.
Create ~/.claude-vm.setup.sh in your home directory to install extra tools into the VM template. It runs once during claude-vm-setup, as the default VM user (with sudo available):
# ~/.claude-vm.setup.sh
sudo apt-get install -y postgresql-client
pip install pandas numpy
Create .claude-vm.runtime.sh at the root of any project. It runs inside the cloned VM each time you call claude-vm, just before Claude starts. Use it for project-specific setup like installing dependencies or starting services:
# your-project/.claude-vm.runtime.sh
npm install
docker compose up -d
When you run claude-vm --github inside a git repo with a GitHub remote, it automatically:
- Detects the repository from `git remote`
- Obtains a repo-scoped GitHub token via the device flow (browser-based OAuth)
- Starts two host-side proxies: one for the GitHub MCP Server, one for Git HTTP
- Configures the VM so both `git push/pull` and MCP tools work transparently
- Writes instructions to `~/.claude/CLAUDE.md` in the VM so Claude knows git is available
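
As an illustration of the first step, here is a minimal sketch of how an owner/repo pair could be extracted from a remote URL. The helper name `parse_github_remote` and the regular expression are assumptions made for this example; the actual detection logic in claude-vm.sh may differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern covering the common SSH and HTTPS remote forms.
_GITHUB_REMOTE = re.compile(
    r"^(?:git@github\.com:|ssh://git@github\.com/|https://github\.com/)"
    r"(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+?)(?:\.git)?/?$"
)


def parse_github_remote(url: str) -> Optional[Tuple[str, str]]:
    """Return (owner, repo) for a GitHub remote URL, or None if it is not one."""
    match = _GITHUB_REMOTE.match(url.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("repo")


if __name__ == "__main__":
    print(parse_github_remote("git@github.com:sylvinus/agent-vm.git"))
```

The input would typically come from `git remote get-url origin`. Returning `None` for non-GitHub remotes lets the caller skip the GitHub integration instead of guessing.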
No credentials are ever exposed to the VM. Both proxies inject the token on the host side and enforce repo scope.
Tokens are generated via a GitHub App using the device flow:
- One-time setup: Create a GitHub App with `contents: write` permission and install it on your org/account. The App's Client ID is configured in `claude-vm.sh`.
- Per-session: `github_app_token_demo.py` initiates the device flow: you approve in a browser, and a user access token is returned.
- Repo scoping: The `--repo` flag resolves the repository's numeric ID and passes it as `repository_id` during the OAuth token exchange. GitHub scopes the resulting token to that single repository at the API level.
- Caching: Tokens are cached in `~/.cache/claude-vm/` and automatically refreshed when expired, so you only need to re-authorize when the refresh token expires.
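
The caching rule can be sketched as a small decision function. This is only an illustration: the cache field names (`expires_at`, `refresh_token`, `refresh_token_expires_at`) and the function name `next_token_action` are hypothetical, not the real on-disk schema.

```python
import time
from typing import Optional


def next_token_action(cache: Optional[dict], now: Optional[float] = None) -> str:
    """Decide what to do with a cached token entry.

    Returns "use", "refresh", or "authorize".
    Field names are hypothetical; timestamps are Unix seconds.
    """
    now = time.time() if now is None else now
    if not cache:
        return "authorize"  # nothing cached: run the device flow
    if cache.get("expires_at", 0) > now:
        return "use"  # access token is still valid
    if cache.get("refresh_token") and cache.get("refresh_token_expires_at", 0) > now:
        return "refresh"  # exchange the refresh token without user interaction
    return "authorize"  # refresh token expired too: approve in the browser again
```

Only the last branch requires a browser round trip, which matches the behavior described above.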
The Git HTTP proxy (github-git-proxy.py) lets the VM push and pull via standard git commands without SSH keys or tokens in the VM.
How it works:
- The proxy runs on the host, listening on HTTP
- `claude-vm.sh` configures git's `url.<proxy>.insteadOf` in the VM's `~/.gitconfig` to rewrite the repo's SSH and HTTPS URLs through the proxy
- The proxy injects Basic auth (`x-access-token:TOKEN`) for requests matching the configured repo
- Requests for other repos are forwarded without credentials (they fail auth on GitHub's side)
- The host's git `user.name` and `user.email` are copied into the VM
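
The steps above can be sketched roughly as follows. The function names, the proxy address, and the path-matching rule are assumptions for illustration; github-git-proxy.py may implement these differently.

```python
import base64
from typing import Dict, List


def basic_auth_header(token: str) -> str:
    """Build the Basic auth value for the user name x-access-token."""
    raw = f"x-access-token:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def is_configured_repo(path: str, owner: str, repo: str) -> bool:
    """True if a request path such as /owner/repo.git/info/refs targets the repo."""
    parts = path.split("?", 1)[0].strip("/").split("/")
    if len(parts) < 2:
        return False
    name = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
    return (parts[0].lower(), name.lower()) == (owner.lower(), repo.lower())


def auth_headers(path: str, owner: str, repo: str, token: str) -> Dict[str, str]:
    """Inject credentials only for the configured repo; other repos get none."""
    if is_configured_repo(path, owner, repo):
        return {"Authorization": basic_auth_header(token)}
    return {}


def insteadof_commands(proxy_base: str, owner: str, repo: str) -> List[List[str]]:
    """git config invocations that route the repo's SSH and HTTPS URLs via the proxy."""
    target = f"{proxy_base}/{owner}/{repo}"
    originals = [
        f"git@github.com:{owner}/{repo}",
        f"https://github.com/{owner}/{repo}",
    ]
    return [
        ["git", "config", "--global", "--add", f"url.{target}.insteadOf", original]
        for original in originals
    ]
```

For example, `insteadof_commands("http://127.0.0.1:8080", "myorg", "myrepo")` yields two commands to run inside the VM (the address is a placeholder). Because the token only ever appears in `auth_headers` on the host, nothing secret ends up in the VM's `~/.gitconfig`.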
Configuration:
| Env var | Description | Default |
|---|---|---|
| `GITHUB_MCP_TOKEN` | GitHub token (required) | - |
| `GITHUB_MCP_OWNER` | Repository owner (required) | - |
| `GITHUB_MCP_REPO` | Repository name (required) | - |
| `GITHUB_GIT_PROXY_DEBUG` | Set to `1` for verbose logging | `0` |
| `GITHUB_GIT_PROXY_LOG_DIR` | Directory for log file | `.` |
The GitHub MCP proxy (github-mcp-proxy.py) gives the VM access to GitHub's MCP Server for issues, PRs, code search, and other API operations.
Defense-in-depth:
Even though the token is already scoped to one repository by GitHub, the proxy adds multiple enforcement layers:
| Layer | Mechanism |
|---|---|
| Owner/repo check | Tool arguments with owner/repo must match the configured repo. Missing values are auto-injected. |
| Search query scoping | repo:OWNER/REPO is injected into search queries. org: and user: qualifiers are rejected. |
| Tool allowlist | Unknown tools are blocked by default (default-deny). Non-repo-scoped tools (search_users, get_teams, etc.) are blocked. |
| Server-side filtering | X-MCP-Toolsets header limits GitHub's server to repos,issues,pull_requests,git,labels by default. |
| Lockdown mode | X-MCP-Lockdown is enabled by default, hiding issue details from users without push access. |
| Header protection | VM cannot override X-MCP-* headers — the proxy strips them before injecting host-configured values. |
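
To make three of these layers concrete, here is a simplified sketch of the owner/repo check, search query scoping, and header protection. All names (`Blocked`, `enforce_repo_args`, `scope_search_query`, `strip_mcp_headers`) are hypothetical, and the real checks in github-mcp-proxy.py are likely more thorough; for instance, this sketch does not handle a `repo:` qualifier that is already present in the query.

```python
import re
from typing import Dict


class Blocked(Exception):
    """Raised when a request falls outside the configured repository."""


def enforce_repo_args(args: dict, owner: str, repo: str) -> dict:
    """Reject mismatched owner/repo arguments and fill in missing ones."""
    checked = dict(args)
    for key, expected in (("owner", owner), ("repo", repo)):
        value = checked.get(key)
        if value is None:
            checked[key] = expected  # auto-inject the configured value
        elif str(value).lower() != expected.lower():
            raise Blocked(f"{key}={value!r} does not match the configured repo")
    return checked


# Conservative match: any org: or user: qualifier anywhere in the query.
_FORBIDDEN_QUALIFIER = re.compile(r"\b(?:org|user):", re.IGNORECASE)


def scope_search_query(query: str, owner: str, repo: str) -> str:
    """Pin a search query to the configured repo."""
    if _FORBIDDEN_QUALIFIER.search(query):
        raise Blocked("org: and user: qualifiers are not allowed")
    return f"repo:{owner}/{repo} {query}".strip()


def strip_mcp_headers(headers: Dict[str, str], host_values: Dict[str, str]) -> Dict[str, str]:
    """Drop X-MCP-* headers sent by the VM, then apply the host's values."""
    kept = {k: v for k, v in headers.items() if not k.lower().startswith("x-mcp-")}
    kept.update(host_values)
    return kept
```

Each function either returns a sanitized value or raises `Blocked`, so a caller can apply the layers in sequence and refuse the request on the first failure.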
Configuration:
| Env var | Description | Default |
|---|---|---|
| `GITHUB_MCP_TOKEN` | GitHub token (required) | - |
| `GITHUB_MCP_OWNER` | Repository owner (required) | - |
| `GITHUB_MCP_REPO` | Repository name (required) | - |
| `GITHUB_MCP_TOOLSETS` | Comma-separated toolsets | `repos,issues,pull_requests,git,labels` |
| `GITHUB_MCP_TOOLS` | Comma-separated tool names (fine-grained) | (all in allowed toolsets) |
| `GITHUB_MCP_READONLY` | Set to `1` for read-only mode | `0` |
| `GITHUB_MCP_LOCKDOWN` | Set to `0` to disable lockdown | `1` |
| `GITHUB_MCP_PROXY_DEBUG` | Set to `1` for verbose logging | `0` |
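
A minimal sketch of how these variables and their defaults could be read is shown below. The `McpProxyConfig` dataclass and `load_config` function are hypothetical names for illustration; only the variable names and defaults come from the table above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

DEFAULT_TOOLSETS = "repos,issues,pull_requests,git,labels"
REQUIRED = ("GITHUB_MCP_TOKEN", "GITHUB_MCP_OWNER", "GITHUB_MCP_REPO")


@dataclass(frozen=True)
class McpProxyConfig:
    token: str
    owner: str
    repo: str
    toolsets: Tuple[str, ...]
    tools: Optional[Tuple[str, ...]]  # None means all tools in the allowed toolsets
    readonly: bool
    lockdown: bool
    debug: bool


def _csv(value: str) -> Tuple[str, ...]:
    """Split a comma-separated value, ignoring blanks and surrounding spaces."""
    return tuple(item.strip() for item in value.split(",") if item.strip())


def load_config(env: Mapping[str, str]) -> McpProxyConfig:
    """Build the config from an environment mapping such as os.environ."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    tools = env.get("GITHUB_MCP_TOOLS")
    return McpProxyConfig(
        token=env["GITHUB_MCP_TOKEN"],
        owner=env["GITHUB_MCP_OWNER"],
        repo=env["GITHUB_MCP_REPO"],
        toolsets=_csv(env.get("GITHUB_MCP_TOOLSETS", DEFAULT_TOOLSETS)),
        tools=_csv(tools) if tools else None,
        readonly=env.get("GITHUB_MCP_READONLY", "0") == "1",
        lockdown=env.get("GITHUB_MCP_LOCKDOWN", "1") != "0",  # on unless set to 0
        debug=env.get("GITHUB_MCP_PROXY_DEBUG", "0") == "1",
    )
```

Note the asymmetry: read-only and debug are opt-in, while lockdown stays enabled unless it is explicitly set to `0`.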
Both proxies can be run independently of claude-vm:
# MCP proxy
GITHUB_MCP_TOKEN=ghu_... GITHUB_MCP_OWNER=myorg GITHUB_MCP_REPO=myrepo \
python3 github-mcp-proxy.py
# Git HTTP proxy
GITHUB_MCP_TOKEN=ghu_... GITHUB_MCP_OWNER=myorg GITHUB_MCP_REPO=myrepo \
python3 github-git-proxy.py
# Both print the listening port to stdout
- `claude-vm-setup` creates a Debian 13 VM with Lima, installs dev tools + Chrome + Claude Code, and stops it as a reusable template
- `claude-vm [args]` clones the template, mounts your working directory read-write, runs optional `.claude-vm.runtime.sh`, then launches Claude with full permissions (forwarding any arguments to the `claude` command)
- On exit, the cloned VM is stopped and deleted. The template persists for reuse
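
The clone, run, delete lifecycle can be pictured as a context manager that always cleans up. This is a conceptual sketch only: it does not call Lima, the `clone` and `delete` callables are placeholders the caller would supply, and the real claude-vm.sh is a shell script.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def ephemeral_vm(
    template: str,
    name: str,
    clone: Callable[[str, str], None],
    delete: Callable[[str], None],
) -> Iterator[str]:
    """Clone a VM from the template and delete the clone on exit, even on error."""
    clone(template, name)
    try:
        yield name
    finally:
        delete(name)  # the template itself is never touched


if __name__ == "__main__":
    log = []
    with ephemeral_vm(
        "template",
        "session-1",
        clone=lambda src, dst: log.append(f"clone {src} -> {dst}"),
        delete=lambda vm: log.append(f"delete {vm}"),
    ) as vm:
        log.append(f"run agent in {vm}")
    print(log)
```

The `finally` clause is what guarantees that the clone is removed whether the agent exits normally or crashes.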
Ports opened inside the VM (e.g. by Docker containers) are automatically forwarded to your host by Lima.
| Category | Packages |
|---|---|
| Core | git, curl, wget, build-essential, jq |
| Python | python3, pip, venv |
| Node.js | Node.js 22 (via NodeSource) |
| Search | ripgrep, fd-find |
| Browser | Chromium (headless), xvfb |
| Containers | Docker Engine, Docker Compose |
| AI | Claude Code, Chrome DevTools MCP server |
Running an AI agent with full permissions is powerful but risky. Here's how the options compare:
| | No sandbox | Docker | VM (agent-vm) |
|---|---|---|---|
| Agent can run any command | Yes | Yes | Yes |
| File system isolation | None | Partial (shared kernel) | Full |
| Network isolation | None | Partial | Full |
| Can run Docker inside | Yes | Requires DinD or socket mount | Yes (native) |
| Kernel-level isolation | None | None (shares host kernel) | Full (separate kernel) |
| Protection from container escapes | None | None | Yes |
| Browser / GUI tools | Host only | Complex setup | Built-in (headless Chromium) |
Docker containers share the host kernel, so a motivated agent could exploit kernel vulnerabilities or misconfigurations to escape. A VM runs its own kernel: even if the agent gains root inside the VM, it can't reach the host beyond the working directory that claude-vm mounts read-write.
A VM also avoids the practical headaches of Docker sandboxing. Docker runs natively inside the VM without Docker-in-Docker hacks or socket mounts. Headless Chromium works out of the box without fiddling with --no-sandbox flags or shared memory settings. Lima automatically forwards ports from the VM to your host, so if the agent starts a server on port 3000, it's immediately accessible at localhost:3000. The agent gets a normal Linux environment where everything just works.
Finally, using a VM means you don't need Node.js, npm, Docker, or any other dev tooling installed on your host machine. The only host dependency is Lima. All the tools (and their vulnerabilities) live inside the VM.
For AI agents running with --dangerously-skip-permissions, a VM is the only option in this comparison that isolates the agent at the kernel level, which is what makes the sandbox meaningful.
MIT