Trace MITM scoping + hybrid busy signal + stuck-handle safety net #40

Merged
andrew-jon-p7a merged 1 commit into main from feat/scoped-mitm-plus-trace-work
May 13, 2026

@andrew-jon-p7a
Contributor

    Scoped MITM (LLM-host allowlist)
      - Add `runtime/trace/known-hosts.ts` exporting `isKnownLlmHost`
        and `KNOWN_LLM_HOST_PATTERNS`. Seeded with Anthropic, OpenAI,
        and Azure OpenAI subdomain patterns. Single source of truth
        the proxy and the decoder both consult.
      - Proxy now takes a `shouldMitm?: (host) => boolean` predicate.
        `host.ts` wires `isKnownLlmHost`. Hosts off the allowlist fall
        through to the existing raw TCP tunnel — the agent's TLS client
        talks straight to the real upstream cert, system trust applies,
        we never see plaintext. Hosts on the allowlist get the MITM
        treatment as before. Predicate omission preserves legacy
        "decrypt everything" for tests.
      - `anthropic.ts` re-imports the regex from `known-hosts.ts` plus
        a module-load sanity check so the decoder and allowlist can't
        drift silently.
      - Practical effect: curl / git / python / wget invocations from
        an agent (or from the bundled tools an agent calls) now succeed
        against any HTTPS host without needing `SSL_CERT_FILE` /
        `REQUESTS_CA_BUNDLE` / `CURL_CA_BUNDLE` / `GIT_SSL_CAINFO`
        env-var injection. Those vars are deliberately NOT set — under
        scoped MITM they would replace system trust with our single-CA
        pem and break every non-allowlisted call.
      - Honest privacy claim: ac7 only decrypts traffic to LLM-provider
        hosts on a maintained allowlist. Codex `CODEX_CA_CERTIFICATE`
        stays (Rust/reqwest still needs it for the MITM'd OpenAI hosts).
      - Docs: `tracing.mdx` reframed (new "Host allowlist" section,
        updated overview + security posture);
        `concepts/activity-and-traces.mdx` limitations corrected.
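
A minimal sketch of how the allowlist and the `shouldMitm` predicate might fit together. The regexes below are illustrative guesses: the description names the providers but not the exact patterns. `routeConnect` is a hypothetical helper that stands in for the proxy's CONNECT dispatch.

```typescript
// Hypothetical patterns: the PR names the providers, not the exact regexes.
export const KNOWN_LLM_HOST_PATTERNS: readonly RegExp[] = [
  /^api\.anthropic\.com$/i,
  /^api\.openai\.com$/i,
  /^[a-z0-9-]+\.openai\.azure\.com$/i,
];

export function isKnownLlmHost(host: string): boolean {
  // Strip an optional :port and a trailing dot before matching.
  const bare = host.replace(/:\d+$/, "").replace(/\.$/, "");
  return KNOWN_LLM_HOST_PATTERNS.some((re) => re.test(bare));
}

type ShouldMitm = (host: string) => boolean;

// Hypothetical dispatch helper. Omitting the predicate keeps the legacy
// "decrypt everything" behavior; otherwise only allowlisted hosts are
// intercepted and everything else goes through the raw TCP tunnel.
export function routeConnect(
  host: string,
  shouldMitm?: ShouldMitm,
): "mitm" | "tunnel" {
  if (shouldMitm === undefined) return "mitm";
  return shouldMitm(host) ? "mitm" : "tunnel";
}
```

Anchoring each pattern with `^` and `$` matters here: a lookalike such as `api.openai.com.evil.example` must fall through to the tunnel rather than be decrypted.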

    Hybrid "agent is working" signal
      - `BusySignal` gains per-source counters keyed by a `BusySource`
        union (`'llm_inflight' | 'tool_inflight'`). `start(source?)`
        defaults to `llm_inflight` for backwards-compat with existing
        MITM call sites. Public surface stays a single `busy: boolean`
        so the UI never has to merge state. New `getSourceCounts()`
        surfaces which feeder is wedged when things go wrong.
      - Codex: new `agents/codex/busy-sniff.ts` attaches to the existing
        JSON-RPC client. Bumps `tool_inflight` on `item/started` for
        `commandExecution` / `fileChange` / `mcpToolCall`, drains on
        matching `item/completed`, and sweeps via `turn/completed` plus
        teardown `drain()`. Zero subprocess overhead: the JSON-RPC
        stream is already proxied, so the sniff only inspects
        notifications inline.
      - Claude Code: new `runtime/trace/hook-server.ts` binds a loopback
        HTTP endpoint at TraceHost startup; new `prepareClaudeSettings`
        writes `.claude/settings.json` with `type: "http"` hook entries
        for `PreToolUse` / `PostToolUse` / `PostToolUseFailure` pointing
        at it. Same backup-then-restore discipline as `prepareMcpConfig`,
        with `x_ac7_busy_feeder: true` marker so stale entries from a
        prior crash get auto-purged.
      - Net result: the indicator now lights up during model-in-flight
        AND during tool execution windows (bash, file edits, MCP tools)
        that the LLM-call bump alone wouldn't cover.
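
The per-source counters and the Codex sniff described above could look roughly like the sketch below. It is not the code from this PR: the `Notification` shape and the `createBusySniff` name are assumptions made for illustration, and the real `attachCodexBusySniff` and JSON-RPC payloads may differ.

```typescript
type BusySource = "llm_inflight" | "tool_inflight";

class BusySignal {
  private counts: Record<BusySource, number> = {
    llm_inflight: 0,
    tool_inflight: 0,
  };

  // Returns a finish function; calling it more than once is a no-op.
  start(source: BusySource = "llm_inflight"): () => void {
    this.counts[source] += 1;
    let done = false;
    return () => {
      if (done) return;
      done = true;
      this.counts[source] -= 1;
    };
  }

  // Single boolean for the UI, regardless of which feeder is active.
  get busy(): boolean {
    return this.counts.llm_inflight + this.counts.tool_inflight > 0;
  }

  getSourceCounts(): Record<BusySource, number> {
    return { ...this.counts };
  }
}

// Item types that count as tool work, per the description above.
const TOOL_ITEM_TYPES = new Set(["commandExecution", "fileChange", "mcpToolCall"]);

// Assumed notification shape; the real Codex payload may differ.
interface Notification {
  method: string;
  params?: { item?: { id: string; type: string } };
}

function createBusySniff(signal: BusySignal) {
  const open = new Map<string, () => void>();
  const drain = (): void => {
    open.forEach((finish) => finish());
    open.clear();
  };
  const onNotification = (n: Notification): void => {
    const item = n.params?.item;
    if (n.method === "item/started" && item && TOOL_ITEM_TYPES.has(item.type)) {
      if (!open.has(item.id)) open.set(item.id, signal.start("tool_inflight"));
    } else if (n.method === "item/completed" && item) {
      open.get(item.id)?.();
      open.delete(item.id);
    } else if (n.method === "turn/completed") {
      drain(); // sweep anything whose item/completed never arrived
    }
  };
  return { onNotification, drain };
}
```

Keying open handles by item id keeps a duplicate `item/started` from double-counting, and the `turn/completed` sweep covers items whose completion notification is lost.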

    Stuck-handle safety net
      - Reported failure: TUI interrupt (Ctrl+C in claude-code) could
        leave a handle stuck in `pendingHandles` because the keep-alive
        socket survives and `onSessionEnd` never fires. With the count
        stuck > 0, the reporter heartbeats `busy:true` to the broker
        every 10s indefinitely.
      - Fix 1: each `start()` returns a handle with an auto-finish
        timer. Defaults per source — 5 min `llm_inflight`, 15 min
        `tool_inflight`. Caller can pass `maxAgeMs` to override or
        `Infinity` to opt out. Timer is `unref()`'d so it can't keep
        the process alive. Auto-finish logs a diagnostic with source +
        age for follow-up investigation.
      - Fix 2: new `BusySignal.forceFinishAll()` drains every live
        handle and emits one busy→idle transition. `TraceHost.close()`
        calls it after proxy + hook server shutdown, with a pre-drain
        `getSourceCounts()` snapshot in the diagnostic log so leaks
        tell us which source they came from. Silent on clean teardowns.
      - SIGKILL is still bounded by the existing server-side 30s TTL —
        nothing the runner can do post `kill -9`.
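
A sketch of the two fixes, assuming a handle-set implementation. The `Handle` shape, the `onChange` callback, and the log wording are hypothetical; only the defaults (5 min / 15 min), the `maxAgeMs` override, the `Infinity` opt-out, the `unref()` call, and the single busy→idle transition come from the description above.

```typescript
type BusySource = "llm_inflight" | "tool_inflight";

// Defaults quoted in the description: 5 min for LLM calls, 15 min for tools.
const DEFAULT_MAX_AGE_MS: Record<BusySource, number> = {
  llm_inflight: 5 * 60_000,
  tool_inflight: 15 * 60_000,
};

interface Handle {
  source: BusySource;
  finish(): void;
}

class BusySignal {
  private live = new Set<Handle>();
  private onChange: (busy: boolean) => void;

  constructor(onChange: (busy: boolean) => void = () => {}) {
    this.onChange = onChange;
  }

  get busy(): boolean {
    return this.live.size > 0;
  }

  start(
    source: BusySource = "llm_inflight",
    maxAgeMs: number = DEFAULT_MAX_AGE_MS[source],
  ): Handle {
    const startedAt = Date.now();
    let timer: ReturnType<typeof setTimeout> | undefined;
    const handle: Handle = {
      source,
      finish: () => {
        if (!this.live.delete(handle)) return; // already finished
        if (timer !== undefined) clearTimeout(timer);
        if (this.live.size === 0) this.onChange(false);
      },
    };
    const wasIdle = this.live.size === 0;
    this.live.add(handle);
    if (wasIdle) this.onChange(true);
    if (Number.isFinite(maxAgeMs)) {
      // Fix 1: auto-finish a handle that outlives its maximum age.
      timer = setTimeout(() => {
        console.warn(
          `busy handle auto-finished: source=${source} ageMs=${Date.now() - startedAt}`,
        );
        handle.finish();
      }, maxAgeMs);
      // In Node, unref() stops the timer from keeping the process alive.
      (timer as unknown as { unref?: () => void }).unref?.();
    }
    return handle;
  }

  // Fix 2: drain every live handle. onChange(false) fires at most once,
  // when the last handle is removed. Returns how many handles were leaked.
  forceFinishAll(): number {
    const leaked = Array.from(this.live);
    leaked.forEach((h) => h.finish());
    return leaked.length;
  }
}
```

Passing `Infinity` skips the timer entirely, so such a handle is only released by an explicit `finish()` or by `forceFinishAll()` at teardown.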

    Tests
      - 13 new tests for known-hosts dispatch + proxy gating (MITM on
        accepted hosts produces plaintext; rejected hosts produce
        ciphertext-only chunks with `mitm: false` session metadata).
      - 9 new tests for the hook server (PreToolUse / Post lifecycle,
        duplicate / unknown ids, malformed bodies, drain-on-close).
      - 7 new tests for the codex sniff driving real JSON-RPC
        notifications through real `attachCodexBusySniff` (tool types
        bump, non-tool types don't, sweep on turn/completed, explicit
        drain).
      - 6 new tests for `prepareClaudeSettings` (create, merge,
        stale-entry purge, corrupt-JSON refusal, idempotent restore,
        `.claude/` preservation when other files live there).
      - 16 new tests for `BusySignal` upgrades (per-source counters,
        max-age timers, default-source fallback, forceFinishAll
        diagnostics, TraceHost teardown integration).
      - Test count: 220 → 258 (+38).

Signed-off-by: Andrew Jon Przybilla <andrew@przy.email>
@andrew-jon-p7a andrew-jon-p7a merged commit 4df8cfa into main May 13, 2026
1 check passed
@andrew-jon-p7a andrew-jon-p7a deleted the feat/scoped-mitm-plus-trace-work branch May 13, 2026 05:19