Run AI coding agents inside sandboxed Linux VMs. The agent gets full autonomy while your host system stays safe.
Uses Lima to create lightweight Debian VMs on macOS and Linux. Ships with dev tools, Docker, and a headless Chrome browser with Chrome DevTools MCP pre-configured.
Supports Claude Code, OpenCode, and Codex CLI out of the box. Other agents can be run via agent-vm shell.
Never install attack vectors such as npm, claude or even Docker on your host machine again!
Feedback welcome!
- macOS or Linux
- Lima (installed automatically via Homebrew if available)
- A subscription or API key for your agent of choice
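As a quick pre-flight check of the prerequisites above, the sketch below reports whether Lima's CLI (`limactl`) or Homebrew (`brew`) is on your PATH. The helper names `have_cmd` and `check_prereqs` and the messages are illustrative and are not part of agent-vm.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether Lima is available, or whether Homebrew could install it.
check_prereqs() {
  if have_cmd limactl; then
    echo "lima: found"
  elif have_cmd brew; then
    echo "lima: missing (Homebrew found, so it can be installed automatically)"
  else
    echo "lima: missing (install it manually)"
  fi
}
```

Running `check_prereqs` before `agent-vm setup` shows which case applies on your machine.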
git clone https://github.com/sylvinus/agent-vm.git
cd agent-vm
# Add to your shell config
echo "source $(pwd)/agent-vm.sh" >> ~/.zshrc # zsh
echo "source $(pwd)/agent-vm.sh" >> ~/.bashrc # or bash

agent-vm setup

Creates a base VM template with dev tools, Docker, Chromium, and AI coding agents pre-installed.
Options:
| Flag | Description | Default |
|---|---|---|
| --disk GB | VM disk size in GB | 20 |
| --memory GB | VM memory in GB | 8 |
| --cpus N | Number of CPUs | 4 |
agent-vm setup --disk 50 --memory 16 --cpus 8 # Larger VM for heavy workloads

cd your-project
agent-vm claude # Claude Code
agent-vm opencode # OpenCode
agent-vm codex # Codex CLI

Creates a persistent VM for the current directory (or reuses it if one already exists), mounts your working directory, and runs the agent with full permissions. The VM persists after the agent exits so you can reconnect later. Ports opened inside the VM (e.g. by Docker containers or dev servers) are automatically forwarded to your host by Lima.
Each agent runs with its respective auto-approve flag:
- claude runs with --dangerously-skip-permissions
- opencode does not yet have an auto-approve flag (waiting on this PR)
- codex runs with --full-auto
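The mapping above can be pictured as a small lookup. This is only a sketch: the function names `agent_flag` and `agent_cmdline` are hypothetical, and placing the flag before any extra arguments is an assumption, not a description of how agent-vm.sh is actually written.

```shell
# Illustrative only: map an agent name to the auto-approve flag listed above.
agent_flag() {
  case "$1" in
    claude)   echo "--dangerously-skip-permissions" ;;
    codex)    echo "--full-auto" ;;
    opencode) echo "" ;;   # no auto-approve flag yet
    *)        echo "unknown agent: $1" >&2; return 1 ;;
  esac
}

# Build the command line an agent might be launched with.
# Assumption: the flag comes first, then any extra arguments.
agent_cmdline() {
  agent=$1; shift
  flag=$(agent_flag "$agent") || return 1
  # $flag is left unquoted on purpose so an empty flag disappears.
  set -- "$agent" $flag "$@"
  echo "$*"
}
```

For example, `agent_cmdline claude --resume` prints `claude --dangerously-skip-permissions --resume`.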
Any extra arguments are forwarded to the agent command:
agent-vm claude -p "fix all lint errors" # Run with a prompt
agent-vm claude --resume # Resume previous session
agent-vm opencode -p "refactor auth module" # OpenCode with a prompt
agent-vm codex -q "explain this codebase" # Codex with a query

agent-vm shell # Open a zsh shell in the VM
agent-vm run npm install # Run a one-off command in the VM
agent-vm run docker compose up -d # Start services

Each directory gets its own persistent VM. You can manage it with:
agent-vm status # Show VM status for the current directory
agent-vm stop # Stop the VM (can be restarted later)
agent-vm destroy # Stop and permanently delete the VM
agent-vm destroy-all # Stop and delete all agent-vm VMs

To resize an existing VM's disk or memory, just pass --disk or --memory again. The VM will be stopped, reconfigured, and restarted automatically:
agent-vm --disk 50 claude # Grow disk to 50GB, then run Claude
agent-vm --memory 16 --cpus 8 shell # Increase memory and CPUs, then open shell

Note: disk can only be grown, not shrunk.
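If you script resizes, a guard like the one below avoids asking for a smaller disk than the VM already has. The function `can_resize_disk` is hypothetical; how you obtain the current size is left to you.

```shell
# Hypothetical check: a requested disk size (in GB) is acceptable only if it is
# not smaller than the current one, since disks can be grown but not shrunk.
can_resize_disk() {
  current_gb=$1
  requested_gb=$2
  [ "$requested_gb" -ge "$current_gb" ]
}
```

For example, `can_resize_disk 20 50 && agent-vm --disk 50 claude` only attempts the resize when it is a grow.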
Running agent-vm setup again updates the base template but does not update existing VMs. You'll see a warning when using a VM cloned from an older base. Use --reset to re-clone:
agent-vm --reset claude # Destroy and re-clone VM, then run Claude

agent-vm --offline claude # Block outbound internet access
agent-vm --readonly shell # Mount project directory as read-only
agent-vm --offline --readonly claude # Both

--offline blocks outbound internet from the VM using iptables while preserving host/VM communication (mounts, port forwarding). Useful for ensuring agents don't phone home or download unexpected packages.
--readonly remounts the project directory as read-only. Useful for code review or audit tasks where the agent shouldn't modify files. Both flags are per-session and reset when the VM restarts.
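To give a feel for what an iptables-based --offline mode can look like, the sketch below prints (without applying) a rule set that allows loopback, established connections, and traffic to the host, then rejects everything else. The function name `offline_rules` and the rules themselves are assumptions for illustration; the rules agent-vm actually installs may differ, and the host address depends on your Lima network setup.

```shell
# Print (do not apply) a hypothetical rule set for an offline session.
# $1 is the address the VM uses to reach the host (an assumption you supply).
offline_rules() {
  host_ip=$1
  cat <<EOF
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -d $host_ip -j ACCEPT
iptables -A OUTPUT -j REJECT
EOF
}
```

Order matters here: the ACCEPT rules must come before the final REJECT, because iptables evaluates a chain from top to bottom and stops at the first match.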
Create ~/.agent-vm/setup.sh to install extra tools into the base VM template. It runs once during agent-vm setup, as the default VM user (with sudo available):
# ~/.agent-vm/setup.sh
sudo apt-get install -y postgresql-client
pip install pandas numpy

Create ~/.agent-vm/runtime.sh to run commands inside every VM on each start. It runs before the per-project runtime script:
# ~/.agent-vm/runtime.sh
export MY_API_KEY="..."

Create .agent-vm.runtime.sh at the root of any project. It runs inside the VM each time a new VM is created for the project, just before you get access. Use it for project-specific setup like installing dependencies or starting services:
# your-project/.agent-vm.runtime.sh
npm install
docker compose up -d

The base VM comes with Chrome DevTools MCP pre-configured for Claude, giving the agent headless browser access.
To add more MCP servers, add them to ~/.claude.json in your ~/.agent-vm/setup.sh, or edit the file directly inside a VM via agent-vm shell. Add entries to the mcpServers object:
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest", "--headless=true", "--isolated=true"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost:5432/mydb"]
    }
  }
}

- agent-vm setup creates a Debian 13 VM with Lima, runs agent-vm.setup.sh inside it to install dev tools + Chrome + agents, and stops it as a reusable base template
- agent-vm claude|opencode|codex [args] clones the base template into a persistent per-directory VM, mounts your working directory, runs optional runtime scripts (~/.agent-vm/runtime.sh then .agent-vm.runtime.sh), then launches the agent with full permissions
- The VM persists after exit. Running any agent command or agent-vm shell in the same directory reuses the same VM
- Use agent-vm stop to stop the VM or agent-vm destroy to delete it
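The runtime-script order in the steps above (global script first, then the per-project one) can be sketched as follows. The function name `run_runtime_scripts` is hypothetical, and sourcing the scripts rather than executing them is an assumption made so that exported variables stay visible; agent-vm's own implementation may differ.

```shell
# Hypothetical sketch: source the global runtime script, then the project one,
# skipping whichever does not exist.
run_runtime_scripts() {
  project_dir=$1
  for script in "$HOME/.agent-vm/runtime.sh" "$project_dir/.agent-vm.runtime.sh"; do
    if [ -f "$script" ]; then
      . "$script"
    fi
  done
}
```

Because the project script runs second, it can override anything the global script exported.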
Each VM is fully isolated — agents must authenticate independently inside their VM (e.g. claude login). Credentials persist within the VM across restarts but are not shared between VMs or with the host.
| File | Description |
|---|---|
| agent-vm.sh | Main script; source this in your shell config |
| agent-vm.setup.sh | Package installation script that runs inside the base VM during setup |
| Category | Packages |
|---|---|
| Core | git, curl, wget, jq, build-essential, unzip, zip |
| Python | python3, pip, venv |
| Node.js | Node.js 24 LTS (via NodeSource) |
| Search | ripgrep, fd-find |
| Utilities | htop, GitHub CLI (gh) |
| Browser | Chromium (headless), xvfb |
| Containers | Docker Engine, Docker Compose |
| AI | Claude Code, OpenCode, Codex CLI, Chrome DevTools MCP server |
AI coding agents need full permissions to be useful — they install dependencies, run builds, execute tests, start servers. But running npm install or pip install means executing arbitrary third-party code on your machine.
This is not a theoretical risk. The Shai-Hulud worm compromised thousands of npm packages in 2025 by injecting malicious code that runs during npm install. It harvested npm tokens, GitHub PATs, SSH keys, and cloud credentials from developers' machines, then used those credentials to spread to other packages the developer maintained. All of this happened silently, in the background, while the legitimate install appeared normal.
An AI agent running with --dangerously-skip-permissions on your host would give such an attack full access to everything: your SSH keys, your cloud credentials, your browser sessions, your entire filesystem.
agent-vm runs all code inside the VM. The VM only has access to your project directory (read-write mount, or read-only with --readonly). It has no access to your SSH keys, npm tokens, cloud credentials, git config, browser sessions, or anything else on your host. If a supply chain attack executes inside the VM, it finds nothing to steal (except your source code) and nowhere to spread. Use --offline to block internet access entirely.
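One way to convince yourself of this is to look for well-known credential files from inside the VM. The sketch below is a hypothetical audit helper (the name `find_host_secrets` and the list of paths are illustrative, not exhaustive); in a fresh VM the expectation is that it prints nothing.

```shell
# Hypothetical audit: print well-known credential locations that exist under a
# home directory (defaults to $HOME).
find_host_secrets() {
  home_dir=${1:-$HOME}
  for path in .ssh/id_rsa .ssh/id_ed25519 .npmrc .aws/credentials; do
    if [ -e "$home_dir/$path" ]; then
      echo "$path"
    fi
  done
}
```

Pasting the function into a session opened with agent-vm shell and calling `find_host_secrets` is one way to run it. Anything it prints was created inside the VM itself, for example by logging in to a service there.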
Meanwhile, your host machine stays clean. You don't need Node.js, Docker, or any dev tooling installed locally. The only host dependency is Lima. Your SSH keys and signing credentials never enter the VM — we recommend running git commit on the host yourself.
| | No sandbox | Docker | VM (agent-vm) |
|---|---|---|---|
| Agent can run any command | Yes | Yes | Yes |
| File system isolation | None | Partial (shared kernel) | Full |
| Network isolation | None | Partial | Optional (--offline) |
| Can run Docker inside | Yes | Requires DinD or socket mount | Yes (native) |
| Kernel-level isolation | None | None (shares host kernel) | Full (separate kernel) |
| Protection from container escapes | None | None | Yes |
| Browser / GUI tools | Host only | Complex setup | Built-in (headless Chromium) |
Docker containers share the host kernel. A motivated attacker (or a compromised dependency running inside the container) could exploit kernel vulnerabilities to escape. A VM runs its own kernel — even root access inside the VM can't reach the host.
A VM also avoids the practical headaches of Docker sandboxing. Docker runs natively inside the VM without Docker-in-Docker hacks. Headless Chromium works out of the box. Lima automatically forwards ports to your host. The agent gets a normal Linux environment where everything just works.
This workflow also replaces Docker Desktop on the Mac, which has become more and more bloated over the years.
MIT