Agent-Browser MCP Server

A security-focused, zero-dependency, local-only Model Context Protocol (MCP) server that wraps Vercel's agent-browser.

This server provides AI agents (like Claude via Claude Desktop) with a fast, token-efficient, and secure interface for browser automation.

⚡ Why Agent-Browser?

Traditional Playwright-based MCP implementations often flood LLM context windows with massive raw DOM states and complex tool schemas.

agent-browser solves this by utilizing a fast CLI backed by a persistent daemon, returning hyper-compact, ref-based accessibility snapshots (e.g., @e1, @e42). This dramatically reduces token usage, speeds up agent reasoning, and leads to more reliable interactions.

This MCP server encapsulates agent-browser's speed while providing a strictly governed, type-safe JSON-RPC interface designed specifically for AI agents.

🏗 Architecture

The system is architected in three layers, optimized for multi-core processors and local execution:

Layer 1: The MCP Server (This Project)
A pure ESM Node.js server implementing the JSON-RPC 2.0 protocol over stdio. It handles schema governance, strict argument validation, session routing, and security boundaries.
Layer 2: The Command Layer
The agent-browser native CLI, which provides sub-millisecond command routing.
Layer 3: The Engine
A persistent background Node.js daemon that manages Chromium instances via the Chrome DevTools Protocol (CDP).
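
As a rough illustration of Layer 1, the sketch below turns one newline-delimited JSON-RPC 2.0 message into one response. The names handleLine and methods are hypothetical and not taken from this repository; only the JSON-RPC 2.0 envelope and its standard error codes (-32700, -32600, -32601) come from the specification.

```javascript
// Hypothetical method table; a real server would route MCP methods here.
const methods = {
  ping: () => ({}),
};

// Build a JSON-RPC 2.0 error response as a string.
function errorResponse(id, code, message) {
  return JSON.stringify({ jsonrpc: "2.0", id, error: { code, message } });
}

// Turn one line of input into one JSON-RPC 2.0 response string,
// or null when the message is a notification (no id, so no reply).
function handleLine(line) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return errorResponse(null, -32700, "Parse error");
  }
  const wellFormed =
    msg !== null &&
    typeof msg === "object" &&
    msg.jsonrpc === "2.0" &&
    typeof msg.method === "string";
  if (!wellFormed) {
    return errorResponse(msg?.id ?? null, -32600, "Invalid Request");
  }
  const isNotification = !("id" in msg);
  const handler = Object.hasOwn(methods, msg.method) ? methods[msg.method] : undefined;
  if (!handler) {
    return isNotification ? null : errorResponse(msg.id, -32601, "Method not found");
  }
  const result = handler(msg.params);
  return isNotification ? null : JSON.stringify({ jsonrpc: "2.0", id: msg.id, result });
}
```

In a stdio server, each line read from stdin would be passed to a function like this, and every non-null return value written to stdout followed by a newline.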

🛡️ Security & Safety Posture

Security is paramount when granting AI agents local browser execution access. This server implements defense-in-depth:

Zero External Dependencies: The MCP server is built using only built-in Node.js modules (fs, crypto, child_process, etc.). No zod, no external SDKs. This eliminates supply-chain risks.
Strict Command Sanitization: Arguments are validated against strict allowlists (e.g., action enums, regex-enforced ref patterns like ^@e\d+$).
No Shell Execution: The agent-browser CLI is invoked using child_process.execFile with shell: false. Arguments are passed as an array and reach the process verbatim, so shell metacharacters in agent-supplied values are never interpreted.
Network Boundaries: The server communicates exclusively via standard I/O (stdio). It binds to no network ports, eliminating external network ingress vectors.
Encrypted State: Session state (cookies, local storage) is stored locally and encrypted at rest using a 256-bit AES key (AGENT_BROWSER_ENCRYPTION_KEY).
Resource Guardrails: Hard caps on concurrent sessions (MCP_MAX_SESSIONS) and command execution time (MCP_COMMAND_TIMEOUT_MS), plus truncation of oversized output, prevent runaway CDP processes and context-window flooding.
Sanitized Logging: All diagnostic logs are written strictly to stderr with automated redaction of absolute home paths and secrets.
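
The sanitization and no-shell points can be sketched together. This is a minimal illustration, not the code in src/validate.js or src/runner.js: buildInteractArgs and runCli are hypothetical names, the action set is an illustrative subset, and the argv layout (action, ref, optional value) is an assumption about the CLI rather than its documented syntax.

```javascript
import { execFile } from "node:child_process";

// Allowlists mirroring the examples in the list above.
const REF_PATTERN = /^@e\d+$/;
const ALLOWED_ACTIONS = new Set(["click", "fill", "type", "hover", "check"]);

// Validate untrusted tool arguments and return a literal argv array.
// Assumption: the CLI accepts "<action> <ref> [value]".
function buildInteractArgs({ action, ref, value }) {
  if (!ALLOWED_ACTIONS.has(action)) {
    throw new TypeError(`action not allowed: ${String(action)}`);
  }
  if (typeof ref !== "string" || !REF_PATTERN.test(ref)) {
    throw new TypeError("ref must match ^@e\\d+$");
  }
  const argv = [action, ref];
  if (value !== undefined) {
    if (typeof value !== "string") throw new TypeError("value must be a string");
    argv.push(value);
  }
  return argv;
}

// Run the CLI without a shell: each argv element reaches the process verbatim,
// so characters such as ; | $ are never interpreted. The timeout kills slow commands.
function runCli(argv, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    execFile("agent-browser", argv, { shell: false, timeout: timeoutMs }, (err, stdout) =>
      err ? reject(err) : resolve(stdout)
    );
  });
}
```

A call such as runCli(buildInteractArgs({ action: "click", ref: "@e5" })) would only reach the CLI after both checks pass; a ref like "@e5; rm -rf /" is rejected before any process is spawned.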

🚀 Installation & Setup

Prerequisites

Node.js >= 18.0.0
macOS, Linux, or Windows (WSL recommended)

1. Clone & Install

git clone https://github.com/yourusername/agent-browser-mcp.git
cd agent-browser-mcp
npm install

2. Auto-Generate Configuration

Run the included setup script to automatically generate your Claude Desktop configuration. This script securely generates an encryption key and resolves the absolute paths required for the server.

node setup.js

This will create a generated_mcp_config.json file in your repository folder, with the following format:

{
  "mcpServers": {
    "agent-browser": {
      "command": "node",
      "args": ["/absolute/path/to/agent-browser-mcp/index.js"],
      "env": {
        "AGENT_BROWSER_ENCRYPTION_KEY": "<paste-your-64-char-hex-key-here>",
        "MCP_SESSION_DIR": "/absolute/path/to/agent-browser-mcp/.sessions",
        "MCP_LOG_LEVEL": "info",
        "MCP_MAX_SESSIONS": "5",
        "MCP_COMMAND_TIMEOUT_MS": "30000",
        "AGENT_BROWSER_HEADED": "true"
      }
    }
  }
}
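
For orientation, a configuration in this format could be produced along the following lines. This is a sketch under stated assumptions, not the contents of setup.js: generateKey and buildConfig are hypothetical names, and the default values are simply copied from the example above.

```javascript
import { randomBytes } from "node:crypto";
import path from "node:path";

// 32 random bytes = 256 bits, hex-encoded to 64 characters.
function generateKey() {
  return randomBytes(32).toString("hex");
}

// Build a config object in the format shown above for a given repository directory.
function buildConfig(repoDir, key) {
  const root = path.resolve(repoDir); // absolute path, as the config requires
  return {
    mcpServers: {
      "agent-browser": {
        command: "node",
        args: [path.join(root, "index.js")],
        env: {
          AGENT_BROWSER_ENCRYPTION_KEY: key,
          MCP_SESSION_DIR: path.join(root, ".sessions"),
          MCP_LOG_LEVEL: "info",
          MCP_MAX_SESSIONS: "5",
          MCP_COMMAND_TIMEOUT_MS: "30000",
          AGENT_BROWSER_HEADED: "true",
        },
      },
    },
  };
}

const config = buildConfig(".", generateKey());
console.log(JSON.stringify(config, null, 2));
```

Environment values are kept as strings because process environment variables are always strings; the server side would parse the numeric ones.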

⚙️ Claude Desktop Configuration

Copy the agent-browser entry from the newly created generated_mcp_config.json file into the mcpServers object of your claude_desktop_config.json file. If that file has no mcpServers object yet, copy the whole contents.

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

🧰 Available Tools

The server exposes 15 tools for agent interaction:

Tool	Description
`browser_navigate`	Navigate to a URL. Returns page title and current URL.
`browser_snapshot`	Returns a compact accessibility-tree snapshot with `@eN` refs.
`browser_interact`	Interact with an element via its ref (click, fill, type, hover, check, etc.).
`browser_get_text`	Extract visible text content from a specific ref or the full page.
`browser_press`	Press a keyboard key (e.g., `Enter`, `Tab`, `Control+a`).
`browser_scroll`	Scroll the page (`up`, `down`, `left`, `right`).
`browser_screenshot`	Take a screenshot of the current viewport or full page.
`browser_get_url`	Return the current active URL.
`browser_select`	Select options in a `<select>` dropdown by ref.
`browser_navigate_back`	Go back in history.
`browser_navigate_forward`	Go forward in history.
`browser_reload`	Reload the current page.
`browser_wait`	Wait for an element to appear (by ref) or for a specific duration (ms).
`session_list`	List active, isolated browser sessions and their idle metrics.
`session_teardown`	Securely destroy a named session and wipe its saved state.
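
On the wire, an MCP client invokes any of these tools with a JSON-RPC 2.0 request to the tools/call method. The helper below is a hypothetical sketch (toolCallRequest is not part of this project) showing the shape of such a request for browser_navigate.

```javascript
// Build a JSON-RPC 2.0 request for the MCP "tools/call" method.
function toolCallRequest(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = toolCallRequest(1, "browser_navigate", { url: "https://example.com" });

// Over stdio, each message is serialized as a single line.
const wire = JSON.stringify(request) + "\n";
```

MCP clients such as Claude Desktop build these messages themselves; the sketch is only meant to make the request format concrete when debugging the server by hand.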

💡 Typical Agent Workflow

An AI agent using this server will typically follow a loop like this:

Navigate: browser_navigate({ url: "https://example.com" })
Observe: browser_snapshot({ interactive_only: true })
Act: browser_interact({ action: "click", ref: "@e5" })
Verify: browser_wait({ ref: "@e12" }) followed by another browser_snapshot.

Because the agent interacts with @eN identifiers rather than raw DOM nodes or complex CSS selectors, the context window remains clean, and interactions are significantly less prone to breakage from minor UI changes.
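
The loop can be written down as a short sketch. Here callTool(name, args) is a hypothetical async client function that sends a tool call and resolves with its result; it is not an API of this project. In practice the agent would pick the ref to click from the first snapshot, but it is passed in here to keep the example short.

```javascript
// Run one navigate / observe / act / verify cycle.
async function runCycle(callTool, { url, clickRef, waitRef }) {
  await callTool("browser_navigate", { url });
  const before = await callTool("browser_snapshot", { interactive_only: true });
  await callTool("browser_interact", { action: "click", ref: clickRef });
  await callTool("browser_wait", { ref: waitRef });
  const after = await callTool("browser_snapshot", { interactive_only: true });
  return { before, after };
}

// Stub client for a dry run: records calls instead of driving a browser.
const calls = [];
const stubCallTool = async (name, args) => {
  calls.push(name);
  return { name, args };
};
```

Calling runCycle(stubCallTool, { url: "https://example.com", clickRef: "@e5", waitRef: "@e12" }) exercises the sequence without a browser, which is handy for checking the order of calls.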

🛠️ Development & Testing

The project includes a lightweight, zero-dependency test runner that covers validation logic and session state behavior.

# Run the test suite (42 tests)
npm test
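
To give a sense of how little a zero-dependency runner needs, here is a minimal sketch. It is not the project's actual runner: test, runAll, and the two sample cases are hypothetical and only reuse the ref pattern quoted in the security section.

```javascript
const cases = [];

// Register a named test case.
function test(name, fn) {
  cases.push({ name, fn });
}

// Run every registered case, report each result, and return the failure count.
function runAll(log = console.log) {
  let failed = 0;
  for (const { name, fn } of cases) {
    try {
      fn();
      log(`ok - ${name}`);
    } catch (err) {
      failed += 1;
      log(`not ok - ${name}: ${err.message}`);
    }
  }
  return failed;
}

test("ref pattern accepts @e42", () => {
  if (!/^@e\d+$/.test("@e42")) throw new Error("expected match");
});

test("ref pattern rejects CSS selectors", () => {
  if (/^@e\d+$/.test("#login")) throw new Error("expected no match");
});

// A non-zero exit code signals failure to npm and CI.
process.exitCode = runAll();
```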

Project Structure

index.js: Main entry point.
src/server.js: JSON-RPC 2.0 protocol handler and request router.
src/tools.js: MCP tool schemas and dispatch logic.
src/validate.js: Strict input validators.
src/runner.js: Secure child process execution.
src/session.js: Session tracking and teardown.
src/logger.js: Sanitized stderr logging.

License

MIT
