Merged
10 changes: 8 additions & 2 deletions docs/concepts/activity-and-traces.mdx
Original file line number Diff line number Diff line change
@@ -198,9 +198,15 @@ WAL, online prune doesn't block live writes for long.
no `llm_exchange` entries — the proxy doesn't speak HPACK yet.
In practice the Anthropic SDK defaults to HTTP/1.1 for
`/v1/messages`, so this is rarely hit.
- **Anthropic parser only.** OpenAI / Gemini / Mistral land as
- **Anthropic parser only.** OpenAI / Azure OpenAI traffic IS
captured (those hosts are on the MITM allowlist) but lands as
`opaque_http`. Codex traces today fall in this bucket — adding
typed parsers is a follow-up.
typed parsers for OpenAI's Chat Completions / Responses APIs
is a follow-up. Gemini / Mistral / Bedrock are not currently
on the allowlist — their traffic passes through unmodified
and produces no activity events. See
[tracing.mdx → Host allowlist](/docs/tracing#host-allowlist)
for how to extend it.
- **Uploader queue cap.** The uploader caps in-flight at 1000
events / 1 MB and evicts oldest-first under sustained broker
unreachability. Events dropped here won't appear later.
99 changes: 81 additions & 18 deletions docs/tracing.mdx
@@ -15,13 +15,30 @@ without embedding observability hooks into the agent itself.

The runner runs **upstream** of the agent process and intercepts
its network traffic at the TLS layer via a **loopback MITM TLS
proxy** with a per-session local CA. Every HTTPS request the
agent makes is transparently decrypted by the proxy, observed as
plaintext, re-encrypted toward the real upstream, and passed
through. From the upstream's point of view we are a normal TLS
client doing standard SNI + cert validation — it can't tell us
apart from any other user-agent, which means OAuth flows, token
refreshes, streaming responses, and SSE all work identically.
proxy** with a per-session local CA. HTTPS requests to **known
LLM-provider hosts** (Anthropic, OpenAI, Azure OpenAI — see the
allowlist below) are transparently decrypted by the proxy,
observed as plaintext, re-encrypted toward the real upstream,
and passed through. From the upstream's point of view we are a
normal TLS client doing standard SNI + cert validation — it can't
tell us apart from any other user-agent, which means OAuth flows,
token refreshes, streaming responses, and SSE all work
identically.

Traffic to any **other** host (GitHub, npm, package registries,
agent-spawned `curl`/`git`/`python` calls, telemetry endpoints,
etc.) is **not** decrypted. The proxy still routes those
connections — it has to, because `HTTPS_PROXY` is set on the
agent child — but it passes the TCP bytes through as a raw
tunnel. The agent's TLS client negotiates directly with the real
upstream cert and the agent's system trust store is what
validates it; the runner never sees plaintext.
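
The proxy wiring can be sketched as follows. `HTTPS_PROXY` is stated
above; the use of `NODE_EXTRA_CA_CERTS` (Node's additive, Node-only
CA override) and the helper names are assumptions for illustration,
not the runner's actual code:

```typescript
// Sketch: spawn the agent child behind the loopback MITM proxy.
// HTTPS_PROXY comes from the docs; NODE_EXTRA_CA_CERTS is assumed.
import { spawn, type ChildProcess } from 'node:child_process';

function buildAgentEnv(proxyPort: number, caPemPath: string): NodeJS.ProcessEnv {
  return {
    ...process.env,
    // Route the child's HTTPS through the loopback proxy.
    HTTPS_PROXY: `http://127.0.0.1:${proxyPort}`,
    // Adds the per-session CA to Node's default trust store
    // (it does not replace it, unlike SSL_CERT_FILE and friends).
    NODE_EXTRA_CA_CERTS: caPemPath,
  };
}

function spawnAgent(
  cmd: string,
  args: string[],
  proxyPort: number,
  caPem: string,
): ChildProcess {
  return spawn(cmd, args, { env: buildAgentEnv(proxyPort, caPem), stdio: 'inherit' });
}
```

Because `NODE_EXTRA_CA_CERTS` extends the default roots rather than
replacing them, non-allowlisted hosts still validate against the
system trust store.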

This is the honest privacy claim: **ac7 only decrypts traffic to
LLM provider hosts on a maintained allowlist.** Adding a host
requires editing
[`known-hosts.ts`](https://github.com/anthropics/agentc7) (and
ideally a parser for whatever shape lives behind it).

Zero external tools. No tshark. No pcap. No SSLKEYLOGFILE
shenanigans. Just Node's built-in `crypto` + `tls` + a small
@@ -124,6 +141,39 @@ Node-only and would confuse reqwest.

For codex specifics see [runners/codex](/docs/runners/codex).

## The host allowlist

The proxy is host-agnostic: every CONNECT lands in the same code
path. The decision of "decrypt this session" vs "pass it through
unmodified" is gated by a single predicate, `isKnownLlmHost`,
exported from `runtime/trace/known-hosts.ts`. Production wires
that predicate into the proxy at startup; tests can substitute
any predicate they want to drive both code paths.

Current allowlist patterns (case-insensitive regexes, each
anchored at the end so it matches the apex domain or any
subdomain):

```
(?:^|\.)anthropic\.com$ # api.anthropic.com, console.anthropic.com, auth.anthropic.com
(?:^|\.)openai\.com$ # api.openai.com, auth.openai.com
(?:^|\.)openai\.azure\.com$ # <customer>.openai.azure.com
```
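
A minimal sketch of the predicate, assuming the real
`known-hosts.ts` compiles these same patterns (the actual module
may be structured differently):

```typescript
// Sketch of the `isKnownLlmHost` predicate built from the
// documented patterns; runtime/trace/known-hosts.ts may differ.
const KNOWN_LLM_HOST_PATTERNS: ReadonlyArray<RegExp> = [
  /(?:^|\.)anthropic\.com$/i,
  /(?:^|\.)openai\.com$/i,
  /(?:^|\.)openai\.azure\.com$/i,
];

function isKnownLlmHost(host: string): boolean {
  // Normalize: strip any :port suffix and a trailing FQDN dot.
  const bare = host.toLowerCase().replace(/:\d+$/, '').replace(/\.$/, '');
  return KNOWN_LLM_HOST_PATTERNS.some((re) => re.test(bare));
}
```

Note the `(?:^|\.)` prefix: it accepts the apex and any subdomain
while rejecting suffix spoofs like `evil-openai.com`.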

The bar for adding a host: the agent (or its bundled tools)
makes inference-related calls to it — model invocations, token
refresh against the same provider, provider-specific telemetry
that the trace pipeline knows how to parse. Hosts we just
"happen to see" (a package registry the agent installs from, a
GitHub API call from an MCP tool) do not belong here.

Non-allowlisted CONNECTs produce **no activity events** today.
At the proxy layer the runner still observes session metadata
(host, port, byte counts, duration) via the `onSessionEnd`
callback, but this data isn't currently surfaced into the
activity stream — promoting it to a new `network_session` event
kind is a candidate follow-up if visibility into non-LLM network
activity becomes valuable.
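
For illustration, the per-session metadata handed to `onSessionEnd`
might carry a shape like this — the field names here are
assumptions, not the actual callback type:

```typescript
// Assumed shape for per-session proxy metadata; illustrative only.
interface SessionSummary {
  host: string;
  port: number;
  bytesIn: number;
  bytesOut: number;
  durationMs: number;
  decrypted: boolean; // true only for allowlisted LLM hosts
}

// One-line rendering, e.g. for debug logs.
function summarize(s: SessionSummary): string {
  const mode = s.decrypted ? 'mitm' : 'tunnel';
  return `${mode} ${s.host}:${s.port} ${s.bytesIn + s.bytesOut}B in ${s.durationMs}ms`;
}
```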

## How the MITM works

When the agent issues `CONNECT api.anthropic.com:443` through the
Expand Down Expand Up @@ -250,23 +300,32 @@ the work. ac7 mitigates this with defense in depth:
only to `127.0.0.1` on a random ephemeral port. The CA is
generated fresh per runner process; its cert is written with
`0o600`; its private key never touches disk.
2. **Redaction at parse time.** Secrets are replaced with
2. **Scoped to an LLM-host allowlist.** TLS termination only
fires for hosts on the `known-hosts.ts` allowlist. Everything
else passes through as a raw TCP tunnel — the agent's TLS
client validates the real upstream cert against the agent's
system trust store and the runner never observes plaintext.
This is also why `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` /
`CURL_CA_BUNDLE` / `GIT_SSL_CAINFO` are deliberately *not*
set: they would replace the system trust store with our
single-CA PEM and break every non-allowlisted HTTPS call.
3. **Redaction at parse time.** Secrets are replaced with
`[REDACTED]` before entries leave the runner. The server
never sees the plaintext token.
3. **Permission-gated view.** Only members with `activity.read`
4. **Permission-gated view.** Only members with `activity.read`
(or the captured member themselves) can read the activity
stream. Watchers, originators, and assignees of OTHER
members' objectives all get 403 on the GET endpoint.
4. **CA cert deleted on runner exit.** The cert PEM is unlinked
5. **CA cert deleted on runner exit.** The cert PEM is unlinked
on every exit path (normal, SIGINT, SIGTERM,
uncaughtException).
5. **`.mcp.json` restored on every exit** (claude-code only) —
6. **`.mcp.json` restored on every exit** (claude-code only) —
the original is backed up and restored idempotently.
6. **Ephemeral CODEX_HOME removed on exit** (codex only) — the
7. **Ephemeral CODEX_HOME removed on exit** (codex only) — the
entire temp directory is `rm -rf`'d, including the symlink to
the user's `~/.codex/auth.json` (the symlink is removed; the
real file isn't).
7. **Upload is best-effort.** If the upload fails, the runner
8. **Upload is best-effort.** If the upload fails, the runner
logs and moves on. It does NOT retry past the queue cap, and
it does NOT persist the trace to disk.
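
The redaction-at-parse-time step reduces to a replace pass over
each entry before upload. A sketch — the secret patterns themselves
are assumptions, since the runner's real pattern set isn't
documented here:

```typescript
// Illustrative parse-time redaction; the actual pattern set and
// helper name in the runner may differ.
const SECRET_PATTERNS: ReadonlyArray<RegExp> = [
  /sk-ant-[A-Za-z0-9_-]+/g,         // Anthropic-style API keys (assumed)
  /Bearer\s+[A-Za-z0-9._~+/-]+=*/g, // Authorization bearer tokens
];

function redact(text: string): string {
  let out = text;
  for (const re of SECRET_PATTERNS) {
    out = out.replace(re, '[REDACTED]');
  }
  return out;
}
```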

@@ -344,11 +403,15 @@ audit requirements.
ALPN) produce no `llm_exchange` events. Adding an HPACK-aware
parser is a follow-up. In practice the Anthropic SDK defaults
to HTTP/1.1 for `/v1/messages`, so this is rarely hit.
- **Anthropic parser only.** Other LLM providers (OpenAI,
Gemini, Mistral) land as `opaque_http`. **Codex traces today
fall in this bucket** — adding a typed OpenAI parser so codex
traces render the same way claude-code traces do is a
follow-up.
- **Anthropic parser only.** OpenAI / Azure OpenAI traffic IS
decrypted (those hosts are on the allowlist) but lands as
`opaque_http` because there's no structured parser for the
Chat Completions / Responses API shapes yet. **Codex traces
today fall in this bucket** — adding a typed OpenAI parser so
codex traces render the same way claude-code traces do is a
follow-up. Other providers (Gemini, Mistral, Bedrock) are
passed through unmodified (not on the allowlist); adding them
is a code change to `known-hosts.ts` plus, ideally, a parser.
- **Uploader queue cap.** The uploader caps in-flight at 1000
events / 1 MB and evicts oldest-first under sustained broker
unreachability. Events dropped here won't appear in the UI.
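
The queue-cap behavior can be sketched as a bounded queue with
oldest-first eviction. The caps mirror the documented 1000-event /
1 MB limits; the class shape is an assumption:

```typescript
// Bounded uploader queue, evicting oldest-first when either the
// event-count or byte cap is exceeded. Shape is illustrative.
interface QueuedEvent {
  bytes: number;
  payload: unknown;
}

class BoundedQueue {
  private events: QueuedEvent[] = [];
  private totalBytes = 0;

  constructor(
    private readonly maxEvents = 1000,
    private readonly maxBytes = 1_000_000,
  ) {}

  /** Enqueue an event; returns how many old events were evicted. */
  push(e: QueuedEvent): number {
    this.events.push(e);
    this.totalBytes += e.bytes;
    let evicted = 0;
    // Evict from the front until both caps are satisfied again.
    while (this.events.length > this.maxEvents || this.totalBytes > this.maxBytes) {
      const oldest = this.events.shift()!;
      this.totalBytes -= oldest.bytes;
      evicted++;
    }
    return evicted;
  }

  get size(): number {
    return this.events.length;
  }
}
```

Events evicted here are gone for good, matching the "won't appear
in the UI" caveat above.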
37 changes: 37 additions & 0 deletions packages/cli/src/commands/claude-code.ts
@@ -39,8 +39,10 @@ import { resolve } from 'node:path';
import { DEFAULT_PORT, ENV } from '@agentc7/sdk/protocol';
import {
ClaudeCodeAdapterError,
type ClaudeSettingsHandle,
findClaudeBinary,
type McpConfigHandle,
prepareClaudeSettings,
prepareMcpConfig,
} from '../runtime/agents/claude-code.js';
import { HUD_HEIGHT, startHud } from '../runtime/hud.js';
@@ -226,6 +228,7 @@ export async function runClaudeCodeCommand(input: ClaudeCodeCommandInput): Promi
// here tears down the runner before propagating so we don't leave
// an orphaned IPC socket.
let mcpHandle: McpConfigHandle;
let settingsHandle: ClaudeSettingsHandle | null = null;
// Auto-detect the bridge command from the currently-running cli
// process. `process.execPath` is the node binary; `process.argv[1]`
// is the absolute path to the cli's entry script (dist/index.js in
@@ -277,6 +280,26 @@ export async function runClaudeCodeCommand(input: ClaudeCodeCommandInput): Promi
}
log('claude-code: .mcp.json prepared', { path: mcpHandle.path });

// 3b. If tracing is enabled, write a `.claude/settings.json` hook
// config so PreToolUse / PostToolUse events drive the busy
// signal. Skipped under `--no-trace` since there's no hook
// endpoint to point at. Failures here are non-fatal — they
// only degrade the busy-signal accuracy during tool runs, not
// correctness of the agent itself.
if (runner.traceHost) {
try {
settingsHandle = prepareClaudeSettings({
cwd,
hookUrl: runner.traceHost.hookEndpointUrl,
});
log('claude-code: .claude/settings.json prepared', { path: settingsHandle.path });
} catch (err) {
log('claude-code: .claude/settings.json prepare failed (busy hooks disabled)', {
error: err instanceof Error ? err.message : String(err),
});
}
}

// 4. Spawn claude. In interactive sessions we route through a
// node-pty relay so we can (a) reserve the bottom `HUD_HEIGHT`
// rows for the ac7 status strip and (b) own the stream for
@@ -302,6 +325,15 @@ export async function runClaudeCodeCommand(input: ClaudeCodeCommandInput): Promi
error: err instanceof Error ? err.message : String(err),
});
}
if (settingsHandle) {
try {
settingsHandle.restore();
} catch (err) {
log('claude-code: settings.json restore threw', {
error: err instanceof Error ? err.message : String(err),
});
}
}
await runner.shutdown(reason).catch((err) => {
log('claude-code: runner shutdown threw', {
error: err instanceof Error ? err.message : String(err),
@@ -435,6 +467,11 @@ export async function runClaudeCodeCommand(input: ClaudeCodeCommandInput): Promi
} catch {
/* ignore */
}
try {
settingsHandle?.restore();
} catch {
/* ignore */
}
};
process.on('uncaughtException', onUncaught);
process.on('unhandledRejection', onUncaught);
4 changes: 4 additions & 0 deletions packages/cli/src/commands/codex.ts
@@ -184,6 +184,10 @@ export async function runCodexCommand(input: CodexCommandInput): Promise<number>
cwd,
model: input.model,
presence,
// Share the trace host's busy signal so codex tool-lifecycle
// notifications and MITM-derived LLM bumps both feed one
// observable. Null when --no-trace.
busy: runner.traceHost?.busy,
log,
});
} catch (err) {