377 changes: 377 additions & 0 deletions docs/adrs/ADR-007-built-in-tools-and-workspace.md
@@ -0,0 +1,377 @@
# ADR-007: Built-in tools and workspace sandboxing

**Status:** Proposed
**Date:** 2026-03-31

---

## Context

ADR-005 defines tool execution as spawning plugin executables with JSON on
stdin/stdout. This works well for third-party and optional tools, but every
agent needs a baseline set of filesystem tools to be useful — reading files,
editing code, searching content. These operations happen on nearly every LLM
turn and must be fast.

Coding agents (Claude Code, Cursor, Windsurf) have converged on a common tool
set: read, edit, grep, glob, bash. This is not a coincidence — these tools map
directly to what a developer does in a terminal, and LLMs are trained on
enormous amounts of terminal interaction. Providing the same primitives to loom
agents gives them the same capabilities.

The second question is scope: what can an agent's tools access? An unsandboxed
agent with `read` and `bash` can access anything the host user can. This is
fine for a developer sitting at a terminal, but not for a managed agent running
in the background. We need a default boundary.

## Decision

### Built-in tools

The runner includes six built-in tools implemented as functions inside the
runner process. They are always available to every agent. No subprocess is
spawned — tool calls execute in-process for minimal latency.

| Tool | Purpose | Key parameters | Default |
|------|---------|----------------|---------|
| **read** | Read file contents | `path`, `offset`, `limit` | enabled |
| **write** | Create or overwrite a file | `path`, `content` | enabled |
| **edit** | Replace a string in an existing file | `path`, `old_string`, `new_string`, `replace_all` | enabled |
| **glob** | Find files matching a pattern | `pattern`, `path` | enabled |
| **grep** | Search file contents by regex | `pattern`, `path`, `glob` | enabled |
| **bash** | Execute a shell command | `command`, `timeout` | disabled |

All built-in tools except bash are enabled by default. Any built-in tool can
be disabled per-agent via configuration:

```yaml
# loom.yml
agents:
  researcher:
    model: anthropic/claude-sonnet-4-20250514
    tools:
      bash: true    # opt-in
      write: false  # opt-out
      edit: false   # opt-out: read-only agent
```

```sh
# CLI overrides
loom run --name researcher --model anthropic/claude-sonnet-4-20250514 \
--enable-tool bash --disable-tool write --disable-tool edit
```

Disabled tools are excluded from the schema sent to the LLM — the model
doesn't know they exist.

Built-in tools use the same JSON Schema interface that plugins use (see
ADR-005). The runner generates their tool definitions and merges them with any
plugin tool definitions before sending the combined list to the LLM.
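
As a rough illustration of how the defaults, per-agent overrides, and plugin definitions could combine, here is a minimal Python sketch. The names `BUILTIN_DEFAULTS` and `resolve_tools` and the shape of the definition dicts are assumptions made for this example, not part of loom.

```python
# Hypothetical sketch: these names and structures are illustrative only.
BUILTIN_DEFAULTS = {
    "read": True,
    "write": True,
    "edit": True,
    "glob": True,
    "grep": True,
    "bash": False,  # opt-in
}


def resolve_tools(overrides, builtin_defs, plugin_defs):
    """Return the tool definitions that would be sent to the LLM."""
    unknown = set(overrides) - set(BUILTIN_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown built-in tools: {sorted(unknown)}")
    enabled = {**BUILTIN_DEFAULTS, **overrides}
    # Disabled tools are dropped entirely, so the model never sees them.
    builtins = [d for d in builtin_defs if enabled[d["name"]]]
    return builtins + list(plugin_defs)


builtin_defs = [{"name": n, "parameters": {}} for n in BUILTIN_DEFAULTS]
plugin_defs = [{"name": "memory", "parameters": {}}]

# Mirrors the `researcher` configuration shown earlier.
tools = resolve_tools(
    {"bash": True, "write": False, "edit": False}, builtin_defs, plugin_defs
)
names = [t["name"] for t in tools]
```

With the `researcher` overrides, `names` contains `read`, `glob`, `grep`, `bash`, and the plugin tool `memory`.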

#### read

Reads file contents with optional offset and line limit. Returns content with
line numbers. Can read text files, and when supported by the LLM, images and
PDFs.

```json
{
  "name": "read",
  "parameters": {
    "path": "src/index.ts",
    "offset": 0,
    "limit": 200
  }
}
```

Returns the file content as a string with line numbers prefixed.
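
The offset, limit, and line-number behavior might look like the following Python sketch. It assumes `offset` is a 0-based count of lines to skip, `limit` is a maximum line count, and numbers are 1-based and separated from the text by a tab. The ADR does not specify any of these, and `number_lines` and `read_file` are hypothetical names.

```python
from pathlib import Path


def number_lines(text, offset=0, limit=None):
    """Prefix each selected line with its 1-based line number.

    Assumption: `offset` is the number of lines to skip (0-based) and
    `limit` is the maximum number of lines to return.
    """
    lines = text.splitlines()
    end = len(lines) if limit is None else offset + limit
    return "\n".join(
        f"{offset + i + 1}\t{line}" for i, line in enumerate(lines[offset:end])
    )


def read_file(path, offset=0, limit=None):
    """Read a UTF-8 text file and return the numbered slice."""
    return number_lines(Path(path).read_text(encoding="utf-8"), offset, limit)


sample = "alpha\nbeta\ngamma\ndelta\n"
numbered = number_lines(sample, offset=1, limit=2)
```

Here `numbered` holds lines 2 and 3 of the sample, each prefixed with its line number.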

#### write

Creates a new file or overwrites an existing file. Creates parent directories
if they don't exist.

```json
{
  "name": "write",
  "parameters": {
    "path": "src/config.ts",
    "content": "export const PORT = 3000;\n"
  }
}
```
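
A minimal Python sketch of the create-or-overwrite behavior, including parent directory creation. `write_file` is a hypothetical name, and the sketch deliberately omits the workspace boundary check that a real implementation would perform before touching the filesystem.

```python
import tempfile
from pathlib import Path


def write_file(workspace, path, content):
    """Create or overwrite workspace/path, creating parent directories."""
    target = Path(workspace) / path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


# Demo in a throwaway directory: the second call overwrites the first.
with tempfile.TemporaryDirectory() as ws:
    written = write_file(ws, "src/config.ts", "export const PORT = 3000;\n")
    first = written.read_text(encoding="utf-8")
    write_file(ws, "src/config.ts", "export const PORT = 8080;\n")
    second = written.read_text(encoding="utf-8")
```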

#### edit

Performs exact string replacement in an existing file. The `old_string` must
match exactly one location in the file (unless `replace_all` is set). This is
safer than write for modifications — it prevents accidentally overwriting
unrelated content.

```json
{
  "name": "edit",
  "parameters": {
    "path": "src/config.ts",
    "old_string": "const PORT = 3000",
    "new_string": "const PORT = 8080",
    "replace_all": false
  }
}
```

Fails if `old_string` is not found or matches multiple locations (when
`replace_all` is false).
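
The uniqueness rule can be sketched in a few lines of Python. `apply_edit` is a hypothetical helper that works on the file content as a string. Rejecting an empty `old_string` is an extra assumption added here, since an empty string matches everywhere.

```python
def apply_edit(content, old_string, new_string, replace_all=False):
    """Exact string replacement that refuses ambiguous matches."""
    if old_string == "":
        raise ValueError("old_string must not be empty")
    count = content.count(old_string)  # non-overlapping occurrences
    if count == 0:
        raise ValueError("old_string not found")
    if count > 1 and not replace_all:
        raise ValueError(
            f"old_string matches {count} locations; set replace_all to replace them all"
        )
    return content.replace(old_string, new_string)


updated = apply_edit(
    "export const PORT = 3000;\n", "const PORT = 3000", "const PORT = 8080"
)
```

Failing on an ambiguous match forces the caller to supply more surrounding context, which is what keeps the operation from touching unrelated content.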

#### glob

Finds files matching a glob pattern. Returns file paths sorted by modification
time. Useful for discovering project structure.

```json
{
  "name": "glob",
  "parameters": {
    "pattern": "**/*.ts",
    "path": "src"
  }
}
```
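
A Python sketch of glob with modification-time ordering, using `pathlib`. The ADR does not say which direction the sort runs, so newest-first is an assumption here. `glob_by_mtime` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def glob_by_mtime(workspace, pattern, path="."):
    """Return workspace-relative paths matching `pattern`, newest first."""
    root = Path(workspace)
    matches = [p for p in (root / path).glob(pattern) if p.is_file()]
    # Assumption: most recently modified files come first.
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.relative_to(root).as_posix() for p in matches]


# Demo with fixed modification times so the ordering is deterministic.
with tempfile.TemporaryDirectory() as ws:
    Path(ws, "src", "lib").mkdir(parents=True)
    for name, mtime in [("src/a.ts", 100), ("src/lib/b.ts", 300), ("src/c.js", 200)]:
        f = Path(ws, name)
        f.write_text("")
        os.utime(f, (mtime, mtime))
    result = glob_by_mtime(ws, "**/*.ts", "src")
```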

#### grep

Searches file contents using regular expressions. Supports filtering by file
glob and limiting output.

```json
{
  "name": "grep",
  "parameters": {
    "pattern": "function\\s+handle",
    "path": "src",
    "glob": "*.ts"
  }
}
```

Returns matching lines with file paths and line numbers.
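
One way to picture the search is the Python sketch below, which walks the directory, filters file names with `fnmatch`, and applies the regex line by line. The function name and the tuple result shape are assumptions. Skipping files that are not valid UTF-8 is also a choice made for this example.

```python
import fnmatch
import re
import tempfile
from pathlib import Path


def grep(workspace, pattern, path=".", glob="*"):
    """Return (relative_path, line_number, line) for every matching line."""
    regex = re.compile(pattern)
    root = Path(workspace)
    hits = []
    for file in sorted((root / path).rglob("*")):
        if not file.is_file() or not fnmatch.fnmatch(file.name, glob):
            continue
        try:
            text = file.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # assumption: silently skip non-text files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((file.relative_to(root).as_posix(), lineno, line))
    return hits


with tempfile.TemporaryDirectory() as ws:
    Path(ws, "src").mkdir()
    Path(ws, "src", "a.ts").write_text("const x = 1;\nfunction handleAuth() {}\n")
    Path(ws, "src", "b.js").write_text("function handleOther() {}\n")
    hits = grep(ws, r"function\s+handle", "src", "*.ts")
```

Only the `.ts` file is searched, so `hits` contains a single entry pointing at line 2 of `src/a.ts`.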

#### bash

Executes a shell command and returns stdout, stderr, and exit code. The command
runs in the agent's workspace directory with the agent's environment variables.

```json
{
  "name": "bash",
  "parameters": {
    "command": "bun test --filter auth",
    "timeout": 30000
  }
}
```

Bash is **opt-in** — disabled by default, enabled per-agent via configuration.
When disabled, the tool is not included in the schema sent to the LLM. The
operator assumes responsibility for what the agent can do with shell access.
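
A minimal Python sketch of the execution model using `subprocess.run`. It assumes `timeout` is in milliseconds (suggested by the `30000` in the example, but not stated) and that the agent's variables are layered on top of the runner's own environment. `run_bash` and the result keys are hypothetical.

```python
import os
import subprocess
import tempfile


def run_bash(command, workspace, env=None, timeout_ms=30000):
    """Run `command` via bash with cwd set to the workspace."""
    try:
        proc = subprocess.run(
            ["bash", "-c", command],
            cwd=workspace,
            # Assumption: agent env extends, rather than replaces, the host env.
            env={**os.environ, **(env or {})},
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000,  # subprocess expects seconds
        )
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "timed out", "exit_code": None}
    return {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
    }


with tempfile.TemporaryDirectory() as ws:
    result = run_bash('echo "$GREETING"; exit 3', ws, env={"GREETING": "hi"})
```

Nothing in this sketch restricts what the command can reach. Setting `cwd` only chooses the starting directory.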

### Workspace sandboxing

Every agent has a **workspace** — a directory that its built-in tools are
scoped to. All paths passed to built-in tools are resolved relative to the
workspace. Path traversal outside the workspace is rejected.

```
read("src/index.ts") → OK (relative to workspace)
read("/etc/passwd") → REJECTED (absolute path outside workspace)
read("../../etc/passwd") → REJECTED (traversal outside workspace)
```
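
The boundary check could be implemented along these lines. This is a Python sketch under stated assumptions: `resolve_in_workspace` and `PathOutsideWorkspace` are hypothetical names, and the ADR does not say how symlinks are treated. Because `Path.resolve()` follows symlinks, this version also rejects a symlink inside the workspace that points outside it.

```python
import tempfile
from pathlib import Path


class PathOutsideWorkspace(Exception):
    """Raised when a tool path resolves outside the agent's workspace."""


def resolve_in_workspace(workspace, user_path):
    """Resolve `user_path` against the workspace and enforce the boundary."""
    root = Path(workspace).resolve()
    # Joining with an absolute path discards `root`, so "/etc/passwd"
    # resolves to itself and fails the containment check below.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise PathOutsideWorkspace(user_path)
    return candidate


with tempfile.TemporaryDirectory() as ws:
    ok = resolve_in_workspace(ws, "src/index.ts")
    ok_relative = ok.relative_to(Path(ws).resolve()).as_posix()
    rejected = []
    for attempt in ("/etc/passwd", "../../etc/passwd"):
        try:
            resolve_in_workspace(ws, attempt)
        except PathOutsideWorkspace:
            rejected.append(attempt)
```

Checking the fully resolved path, rather than scanning the input for `..`, handles traversal, absolute paths, and symlinks with a single comparison.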

#### Default workspace paths

| Mode | Default workspace | Override |
|------|-------------------|----------|
| `loom run` (foreground) | Current working directory (`cwd`) | `--workspace /path` |
| `loom run --detach` | `$LOOM_HOME/agents/{name}/workspace/` | `--workspace /path` |
| `loom up` (via `loom.yml`) | `$LOOM_HOME/agents/{name}/workspace/` | `workspace:` key in `loom.yml` |

The foreground default of `cwd` is intentional: you `cd` into a project and
run an agent against it, just like running any other CLI tool.

For managed agents, the workspace is a dedicated directory inside the agent's
home. This prevents managed agents from accidentally modifying the operator's
files. The operator can override this to point at a project directory when they
want the agent to work on real files.
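
The defaults in the table reduce to a small decision function. The Python sketch below uses hypothetical mode labels (`"foreground"`, `"detach"`, `"up"`) and a hypothetical function name. Only the resulting paths come from the table.

```python
from pathlib import Path


def default_workspace(mode, name, loom_home, cwd, override=None):
    """Pick the workspace directory for an agent."""
    if override is not None:
        # --workspace on the CLI, or the `workspace:` key in loom.yml
        return Path(override)
    if mode == "foreground":
        return Path(cwd)
    if mode in ("detach", "up"):
        return Path(loom_home) / "agents" / name / "workspace"
    raise ValueError(f"unknown mode: {mode}")


foreground = default_workspace("foreground", "researcher", "/home/u/.loom", "/work/proj")
managed = default_workspace("detach", "researcher", "/home/u/.loom", "/work/proj")
overridden = default_workspace(
    "up", "researcher", "/home/u/.loom", "/work/proj", override="/srv/project"
)
```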

#### Workspace vs agent directories

The workspace is **separate from the agent's internal directories**:

```
$LOOM_HOME/agents/{name}/
  pid
  status
  model
  inbox/
  outbox/
  logs/
  plugins/      ← plugin scoped directories (plugins/{plugin_name}/)
  workspace/    ← built-in tools are scoped here
```

Built-in tools cannot access `inbox/`, `outbox/`, `logs/`, or other agent
internals. These are managed by the runner and by plugin tools with explicit
access grants.

### Plugin tools and scoped directories

Plugin tools (ADR-005) operate on their own scoped directories, separate from
the workspace. The plugin declares the directory it needs; the runner creates
it and passes the path at invocation.

```
$LOOM_HOME/agents/{name}/
  plugins/
    memory/     ← memory plugin's scoped directory
    browser/    ← browser plugin's scoped directory
  workspace/    ← built-in tools' scope
```

All plugin directories live under `plugins/{plugin_name}/` and are created by
the runner on first use.

For example, a **memory** plugin:
- The runner creates `$LOOM_HOME/agents/{name}/plugins/memory/` and passes it
as `scope_dir` in the plugin's invocation JSON
- The plugin reads/writes within its scoped directory
- By default, the plugin cannot see `workspace/`

#### Plugin workspace access

Plugins can be configured to also receive access to the agent's workspace.
This is opt-in per plugin — the operator grants it in the agent or weave
configuration:

```yaml
# loom.yml
agents:
  researcher:
    model: anthropic/claude-sonnet-4-20250514
    plugins:
      memory: {}                # scoped dir only
      code-review:
        workspace_access: true  # gets both scoped dir + workspace
```

When `workspace_access` is enabled, the runner passes both paths in the
plugin's invocation JSON:

```json
{
  "scope_dir": "$LOOM_HOME/agents/researcher/plugins/code-review/",
  "workspace_dir": "/path/to/project"
}
```

The plugin decides how to use each. A code-review plugin might read project
files from the workspace while storing its review state in its scoped
directory. The separation still holds — the scoped directory is the plugin's
private state, the workspace is shared read-write access to the project.

Without `workspace_access`, the plugin only receives `scope_dir`. This is
the safe default: most plugins (memory, caching, scheduling) don't need to
see project files.

This separation means:
- **Workspace** = the agent's view of the outside world (project files, data)
- **Plugin scoped directories** = the plugin's private internal state
- **Workspace access** = opt-in grant for plugins that need to operate on project files
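
To make the grant concrete, here is a Python sketch of how a runner might assemble the path fields of a plugin's invocation JSON. The `scope_dir`, `workspace_dir`, and `workspace_access` keys come from the examples in this section. The function name and its arguments are assumptions.

```python
import tempfile
from pathlib import Path


def build_invocation_paths(agent_dir, plugin_name, workspace, plugin_config):
    """Return the path fields passed to a plugin at invocation."""
    scope_dir = Path(agent_dir) / "plugins" / plugin_name
    scope_dir.mkdir(parents=True, exist_ok=True)  # created on first use
    payload = {"scope_dir": str(scope_dir)}
    # The workspace is only exposed when the operator opted in.
    if plugin_config.get("workspace_access", False):
        payload["workspace_dir"] = str(workspace)
    return payload


with tempfile.TemporaryDirectory() as home:
    agent_dir = Path(home) / "agents" / "researcher"
    memory = build_invocation_paths(agent_dir, "memory", "/path/to/project", {})
    review = build_invocation_paths(
        agent_dir, "code-review", "/path/to/project", {"workspace_access": True}
    )
    memory_dir_exists = Path(memory["scope_dir"]).is_dir()
```

In this sketch the `memory` payload carries only `scope_dir`, while `code-review` also receives `workspace_dir`.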

### Bash sandboxing

Bash is the most powerful tool and the hardest to sandbox. A shell command can
access the network, spawn processes, and read or write files anywhere the host
user can, not just inside the workspace.

loom takes the pragmatic approach: **bash is opt-in, and the operator assumes
responsibility**. When enabled:

- The command runs with `cwd` set to the workspace
- The agent's `env` is applied
- No network or process restrictions are enforced by the runner

This matches how coding agents work today — bash is available, powerful, and
trusted. Operators who need stronger isolation can run agents in containers
or use OS-level sandboxing (namespaces, seccomp, etc.). A future ADR may
define a restricted bash mode with network and filesystem constraints.

## Consequences

### Good

**Fast tool execution.** Built-in tools run in-process with no subprocess
overhead. A read-edit-grep cycle that would require three process spawns as
plugins costs only three function calls plus the underlying file I/O.

**Familiar tool set.** Developers and LLMs both know these tools. The same
read/edit/grep/glob/bash pattern used by coding agents works here. No new
abstractions to learn.

**Safe by default.** Workspace sandboxing means an agent can't accidentally
(or intentionally) modify files outside its designated area. Operators
explicitly choose what the agent can access.

**Clean separation.** Workspace for project files, plugin directories for
internal state. By default neither can see the other; a plugin reaches the
workspace only through an explicit `workspace_access` grant. This prevents an
agent from accidentally overwriting its own memory files via `edit`, or
reading its raw inbox messages.

### Tricky

**Bash escapes the sandbox.** A `bash` command can `curl`, write to `/tmp`,
or read outside the workspace. This is accepted — bash is opt-in and operators
take responsibility. Stronger sandboxing is an OS-level concern.

**Workspace override requires trust.** When an operator sets
`--workspace /path/to/my-project`, the agent gets read-write access to that
entire directory tree. This is powerful and necessary, but the operator must
understand the implications.

**Tool configuration is per-agent.** Operators can disable any built-in tool,
which means different agents in the same weave may have different tool sets.
This is intentional — a read-only research agent shouldn't have write/edit,
while a coding agent needs everything. The runner resolves the tool list at
startup and it's fixed for the agent's lifetime.

**Plugin directory proliferation.** Each plugin gets its own directory under
`plugins/`. Many plugins means many subdirectories, but they are contained
under a single parent — `ls agents/{name}/plugins/` shows the full layout.

## Alternatives considered

**All tools as plugins (subprocess spawn):**
Consistent with ADR-005, but unacceptably slow for tools called on every LLM
turn. A read-edit-grep cycle would spawn three processes. Rejected.

**No workspace sandboxing (full filesystem access):**
Simpler implementation, but a managed background agent with write access to
`/` is a liability. The default should be safe. Rejected for default behavior;
available via `--workspace /`.

**Restricted bash via seccomp/namespaces:**
The runner could enforce network and filesystem restrictions on bash commands
using OS primitives. Feasible but complex, platform-specific, and out of scope
for v1. Deferred to a future ADR.

**Capability-based tool permissions (mandatory declarations):**
Each agent must explicitly declare which tools it needs. Rejected: too much
friction for the common case. The opt-out model (every built-in tool except
bash on by default, disable what you don't want) is simpler and covers the
same use cases.

## References

- ADR-001: Unix process model — agents have `cwd` and `env`
- ADR-002: Filesystem as process table — agent directory layout
- ADR-005: Runner architecture — tool execution protocol and plugin spawning
- ADR-006: CLI and lifecycle — `loom run` modes and workspace defaults