A highly secure, zero-dependency, local-only Model Context Protocol (MCP) server wrapping Vercel's agent-browser.
This server provides AI agents (like Claude via Claude Desktop) with a fast, token-efficient, and secure interface for browser automation.
Traditional Playwright-based MCP implementations often flood LLM context windows with massive raw DOM states and complex tool schemas.
agent-browser addresses this with a fast CLI backed by a persistent daemon that returns compact, ref-based accessibility snapshots (e.g., `@e1`, `@e42`). This reduces token usage, speeds up agent reasoning, and leads to more reliable interactions.
This MCP server encapsulates agent-browser's speed while providing a strictly governed, type-safe JSON-RPC interface designed specifically for AI agents.
The system is architected in three layers, optimized for multi-core processors and local execution:
- Layer 1: The MCP Server (This Project)
  A pure ESM Node.js server implementing the JSON-RPC 2.0 protocol over `stdio`. It handles schema governance, strict argument validation, session routing, and security boundaries.
- Layer 2: The Command Layer
  The `agent-browser` native CLI, which provides sub-millisecond command routing.
- Layer 3: The Engine
  A persistent background Node.js daemon that manages Chromium instances via the Chrome DevTools Protocol (CDP).
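To make Layer 1 more concrete, here is a minimal sketch of how one line of JSON-RPC 2.0 input could be turned into a response. It is not the project's actual code: the `handlers` table, `handleLine`, and `errorResponse` are hypothetical names, and the two handlers are placeholders. The error codes are the standard ones from the JSON-RPC 2.0 specification.

```javascript
// Hypothetical dispatch table: method name -> handler.
// These handlers are placeholders, not the project's real implementations.
const handlers = {
  ping: () => ({}),
  "tools/list": () => ({ tools: [] }),
};

// Standard JSON-RPC 2.0 error codes.
const PARSE_ERROR = -32700;
const INVALID_REQUEST = -32600;
const METHOD_NOT_FOUND = -32601;

function errorResponse(id, code, message) {
  return { jsonrpc: "2.0", id, error: { code, message } };
}

// Turn one line of input into a response object, or null when no reply
// is due (notifications carry no "id" and must not be answered).
function handleLine(line) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return errorResponse(null, PARSE_ERROR, "Parse error");
  }
  const wellFormed =
    msg !== null &&
    typeof msg === "object" &&
    msg.jsonrpc === "2.0" &&
    typeof msg.method === "string";
  if (!wellFormed) {
    return errorResponse(msg?.id ?? null, INVALID_REQUEST, "Invalid Request");
  }
  const isNotification = !("id" in msg);
  // Object.hasOwn avoids matching inherited keys such as "constructor".
  const handler = Object.hasOwn(handlers, msg.method)
    ? handlers[msg.method]
    : undefined;
  if (!handler) {
    return isNotification
      ? null
      : errorResponse(msg.id, METHOD_NOT_FOUND, "Method not found");
  }
  const result = handler(msg.params);
  return isNotification ? null : { jsonrpc: "2.0", id: msg.id, result };
}
```

In a stdio server, each non-null response would typically be written to stdout as a single line, for example with `process.stdout.write(JSON.stringify(response) + "\n")`, while all diagnostics go to stderr so they never mix with protocol traffic.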
Security is paramount when granting AI agents local browser execution access. This server implements defense-in-depth:
- Zero External Dependencies: The MCP server is built using only built-in Node.js modules (`fs`, `crypto`, `child_process`, etc.). No `zod`, no external SDKs. This eliminates supply-chain risk from third-party packages in the server itself.
- Strict Command Sanitization: Arguments are validated against strict allowlists (e.g., action enums, regex-enforced ref patterns like `^@e\d+$`).
- No Shell Execution: The `agent-browser` CLI is invoked using `child_process.execFile` with `shell: false`. Arguments are passed as literal arrays and are never interpreted by a shell, which rules out shell injection.
- Network Boundaries: The server communicates exclusively via standard I/O (`stdio`). It binds to no network ports, eliminating external network ingress vectors.
- Encrypted State: Session state (cookies, local storage) is stored locally and encrypted at rest using a 256-bit AES key (`AGENT_BROWSER_ENCRYPTION_KEY`).
- Resource Guardrails: Hard caps on concurrent sessions (`MCP_MAX_SESSIONS`), execution timeouts (`MCP_COMMAND_TIMEOUT_MS`), and output truncation prevent runaway CDP processes and context-window flooding.
- Sanitized Logging: All diagnostic logs are written strictly to `stderr`, with automated redaction of absolute home paths and secrets.
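The allowlist and no-shell points above can be sketched together. This is an illustration under stated assumptions, not the project's validator: `buildInteractArgv` is a hypothetical helper, the action set contains only the actions named in this README (the real enum may be longer), and the `[action, ref, value]` argument layout is made up for the example rather than taken from the agent-browser CLI documentation.

```javascript
// Assumed subset of allowed actions; the real enum may be longer.
const ALLOWED_ACTIONS = new Set(["click", "fill", "type", "hover", "check"]);

// Ref pattern quoted in the list above. Without the "m" flag, "$" in
// JavaScript matches only at the very end of the input.
const REF_PATTERN = /^@e\d+$/;

// Validate untrusted tool arguments and return an argv array.
// The [action, ref, value] layout is hypothetical.
function buildInteractArgv(args) {
  if (args === null || typeof args !== "object") {
    throw new TypeError("arguments must be an object");
  }
  const { action, ref, value } = args;
  if (typeof action !== "string" || !ALLOWED_ACTIONS.has(action)) {
    throw new RangeError("action is not in the allowlist");
  }
  if (typeof ref !== "string" || !REF_PATTERN.test(ref)) {
    throw new RangeError("ref must match ^@e\\d+$");
  }
  const argv = [action, ref];
  if (value !== undefined) {
    if (typeof value !== "string") {
      throw new TypeError("value must be a string");
    }
    // Free text is kept as ONE array element; it is never concatenated
    // into a command string.
    argv.push(value);
  }
  return argv;
}
```

The returned array is what would be handed to `child_process.execFile(file, args, options, callback)`. Because `execFile` does not spawn a shell by default, a value such as `"a; rm -rf ~"` reaches the child process as a single literal argument instead of being parsed as shell syntax.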
- Node.js >= 18.0.0
- macOS, Linux, or Windows (WSL recommended)
```bash
git clone https://github.com/yourusername/agent-browser-mcp.git
cd agent-browser-mcp
npm install
```

Run the included setup script to automatically generate your Claude Desktop configuration. This script securely generates an encryption key and resolves the absolute paths required for the server.

```bash
node setup.js
```

This will create a `generated_mcp_config.json` file in your repository folder, with the following format:
```json
{
  "mcpServers": {
    "agent-browser": {
      "command": "node",
      "args": ["/absolute/path/to/agent-browser-mcp/index.js"],
      "env": {
        "AGENT_BROWSER_ENCRYPTION_KEY": "<paste-your-64-char-hex-key-here>",
        "MCP_SESSION_DIR": "/absolute/path/to/agent-browser-mcp/.sessions",
        "MCP_LOG_LEVEL": "info",
        "MCP_MAX_SESSIONS": "5",
        "MCP_COMMAND_TIMEOUT_MS": "30000",
        "AGENT_BROWSER_HEADED": "true"
      }
    }
  }
}
```

Copy the contents of the newly created `generated_mcp_config.json` file and add it to your `claude_desktop_config.json` file.
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
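Every value in the `env` block arrives in the server as a string, so the server has to parse and check it at startup. The sketch below shows one way that could look. `loadConfig` and `parsePositiveInt` are hypothetical helpers, and treating `5` and `30000` as fallback defaults is an assumption based on the sample configuration above, not documented behavior.

```javascript
// A 256-bit key written as hex is exactly 64 hex characters.
const KEY_PATTERN = /^[0-9a-f]{64}$/i;

// Parse a decimal string into an integer >= 1, or use the fallback
// when the variable is unset or empty.
function parsePositiveInt(raw, fallback, name) {
  if (raw === undefined || raw === "") return fallback;
  if (!/^\d+$/.test(raw)) {
    throw new Error(`${name} must be a positive integer`);
  }
  const n = Number(raw);
  if (!Number.isSafeInteger(n) || n < 1) {
    throw new Error(`${name} must be a positive integer`);
  }
  return n;
}

// Build a typed config object from an env-like object.
function loadConfig(env) {
  const keyHex = env.AGENT_BROWSER_ENCRYPTION_KEY;
  if (typeof keyHex !== "string" || !KEY_PATTERN.test(keyHex)) {
    throw new Error("AGENT_BROWSER_ENCRYPTION_KEY must be 64 hex characters");
  }
  return {
    // 64 hex characters decode to 32 bytes, i.e. 256 bits.
    key: Buffer.from(keyHex, "hex"),
    // Assumed defaults, mirroring the sample configuration.
    maxSessions: parsePositiveInt(env.MCP_MAX_SESSIONS, 5, "MCP_MAX_SESSIONS"),
    commandTimeoutMs: parsePositiveInt(
      env.MCP_COMMAND_TIMEOUT_MS,
      30000,
      "MCP_COMMAND_TIMEOUT_MS"
    ),
    headed: env.AGENT_BROWSER_HEADED === "true",
  };
}
```

At startup this would be called as `loadConfig(process.env)`. Failing fast on a malformed key or limit keeps a misconfigured server from running with weaker guardrails than intended.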
The server exposes 15 highly optimized tools for agent interaction:
| Tool | Description |
|---|---|
| `browser_navigate` | Navigate to a URL. Returns page title and current URL. |
| `browser_snapshot` | Returns a compact accessibility-tree snapshot with `@eN` refs. |
| `browser_interact` | Interact with an element via its ref (click, fill, type, hover, check, etc.). |
| `browser_get_text` | Extract visible text content from a specific ref or the full page. |
| `browser_press` | Press a keyboard key (e.g., Enter, Tab, Control+a). |
| `browser_scroll` | Scroll the page (up, down, left, right). |
| `browser_screenshot` | Take a screenshot of the current viewport or full page. |
| `browser_get_url` | Return the current active URL. |
| `browser_select` | Select options in a `<select>` dropdown by ref. |
| `browser_navigate_back` | Go back in history. |
| `browser_navigate_forward` | Go forward in history. |
| `browser_reload` | Reload the current page. |
| `browser_wait` | Wait for an element to appear (by ref) or for a specific duration (ms). |
| `session_list` | List active, isolated browser sessions and their idle metrics. |
| `session_teardown` | Securely destroy a named session and wipe its saved state. |
An AI agent using this server will typically follow a loop like this:
- Navigate: `browser_navigate({ url: "https://example.com" })`
- Observe: `browser_snapshot({ interactive_only: true })`
- Act: `browser_interact({ action: "click", ref: "@e5" })`
- Verify: `browser_wait({ ref: "@e12" })` followed by another `browser_snapshot`.
Because the agent interacts with @eN identifiers rather than raw DOM nodes or complex CSS selectors, the context window remains clean, and interactions are significantly less prone to breakage from minor UI changes.
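On the wire, each step of that loop is an MCP `tools/call` request. The sketch below builds the requests for the loop above so the message shape is visible. In practice the MCP client (for example Claude Desktop) constructs these messages, so `toolCall`, `nextId`, and `wire` are purely illustrative names, and the empty argument object on the final snapshot is an assumption.

```javascript
// Request ids only need to be unique per connection; a counter is enough.
let nextId = 1;

// Build an MCP "tools/call" request as a JSON-RPC 2.0 message.
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The navigate / observe / act / verify loop, expressed as requests.
const loop = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_snapshot", { interactive_only: true }),
  toolCall("browser_interact", { action: "click", ref: "@e5" }),
  toolCall("browser_wait", { ref: "@e12" }),
  toolCall("browser_snapshot", {}),
];

// Over stdio, each message is serialized as one line of JSON.
const wire = loop.map((msg) => JSON.stringify(msg)).join("\n") + "\n";
```

Note that the only element identifiers in these payloads are short refs such as `@e5`, which is what keeps each request and response small compared with sending selectors or DOM fragments.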
The project includes a lightweight, zero-dependency test runner that tests validation logic and session state behavior.
```bash
# Run the test suite (42 tests)
npm test
```

- `index.js`: Main entry point.
- `src/server.js`: JSON-RPC 2.0 protocol handler and request router.
- `src/tools.js`: MCP tool schemas and dispatch logic.
- `src/validate.js`: Strict input validators.
- `src/runner.js`: Secure child process execution.
- `src/session.js`: Session tracking and teardown.
- `src/logger.js`: Sanitized stderr logging.
MIT