Skip to content

feat: model Codex subagent result events#247

Merged
wesm merged 7 commits intowesm:mainfrom
tpn:tpn/233-codex-subagent-v2
Mar 28, 2026
Merged

feat: model Codex subagent result events#247
wesm merged 7 commits intowesm:mainfrom
tpn:tpn/233-codex-subagent-v2

Conversation

@tpn
Copy link
Copy Markdown
Contributor

@tpn tpn commented Mar 28, 2026

Closes #233.
Supersedes #242.

Summary

  • model Codex subagent terminal status as first-class tool_result_events in SQLite and PostgreSQL
  • keep tool_calls.result_content as a derived/latest compatibility summary instead of the canonical storage shape
  • render the latest summary by default and expose full chronological subagent result history in the tool block UI
  • preserve event history across full resyncs, PG pushes, and orphan-copy recovery

How This Differs From #242

#242 tried to solve Codex subagent workflows inside the existing single-result model by:

  • rebinding wait / <subagent_notification> output into one mutable tool_calls.result_content
  • synthesizing tool-result-like transcript rows to preserve some ordering information
  • adding parser heuristics to decide which source should "own" the final blob

That approach improved the raw JSON problem, but it kept running into ownership and chronology edge cases because Codex subagent status is not actually a single final blob.

This PR switches to a different model:

  • add additive tool_result_events tables in both SQLite and PostgreSQL
  • attach chronological result events directly to the originating tool call
  • derive tool_calls.result_content from those events for compatibility/search/compact UI
  • keep the existing subagent session linkage on spawn_agent

Practical effect:

  • chronology is preserved without retroactively mutating older transcript rows
  • wait output and fallback notifications are stored as canonical event history
  • PG-backed multi-host views and local SQLite views share the same result-event model

Details

  • spawn_agent remains a Task tool call and still maps to subagent_session_id
  • terminal wait output becomes tool_result_events
  • terminal <subagent_notification> messages also become tool_result_events
  • blocked categories are enforced server-side for event history as well as the compact summary
  • orphan-copy / resync paths preserve stored event history for sessions whose source files no longer exist

Upgrade / Rollout

  • SQLite: additive schema change plus dataVersion bump
  • PostgreSQL: additive schema change, no drop/recreate required
  • exporters should upgrade and run agentsview pg push --full once, or let the normal full-sync path re-export after resync

Known limit:

  • sessions whose original Codex JSONL source is already gone can only preserve event history that was already stored in SQLite; they cannot derive brand new history from missing raw files

Test Plan

  • CC=/usr/bin/gcc CXX=/usr/bin/g++ CGO_ENABLED=1 go test -tags fts5 ./internal/parser ./internal/sync ./internal/db ./internal/postgres -count=1
  • cd frontend && npm test -- --run src/lib/components/content/ToolBlock.test.ts src/lib/utils/messages.test.ts

Local Review

  • local codex, claude-code, and gemini review runs covered both default and security
  • the first v2 pass surfaced issues around stale tool_result_events, blocked-category history leakage, and same-ordinal orphan chronology
  • those follow-up fixes are included in this branch

@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 28, 2026

roborev: Combined Review (6613bf1)

Verdict: Changes are generally sound, but there are 2 medium-severity issues that should be addressed before merge.

Medium

  • Incremental Codex parsing may now fail without a verified full-parse fallback

    • Location: /home/roborev/.roborev/clones/wesm/agentsview/internal/parser/codex.go#L1063
    • Problem: ParseCodexSessionFrom now returns errCodexIncrementalNeedsFullParse for common Codex subagent-related events (wait, spawn_agent output, terminal notifications), but the patch does not show caller-side handling or test coverage proving the sync/import path falls back to a full reparse. If incremental parser errors are currently treated as hard failures, Codex sessions may stop updating once one of these events appears.
    • Suggested fix: Handle this sentinel explicitly at the incremental parse call site by triggering a full session reparse, and add an end-to-end sync/import test covering that fallback path.
  • wait result event ordering is non-deterministic

    • Location: /home/roborev/.roborev/clones/wesm/agentsview/internal/parser/codex.go#L249
    • Problem: status.Map() iterates over a Go map, so if a wait result includes multiple completed agents, appended ResultEvents can be emitted in random order. That makes event_index assignment unstable across resyncs and can produce inconsistent UI ordering.
    • Suggested fix: Iterate with gjson.Result.ForEach so events preserve the original JSON object order.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Addressed the current roborev findings in .

Changes:

  • made the Codex incremental full-parse fallback explicit at the sync call site and added an end-to-end incremental sync test that exercises the fallback path
  • changed multi-agent result extraction to preserve JSON object order, so / are deterministic across reparses

Re-verified with:

  • ok github.com/wesm/agentsview/internal/parser 1.397s
    ok github.com/wesm/agentsview/internal/sync 5.139s
    ok github.com/wesm/agentsview/internal/db 6.998s
    ok github.com/wesm/agentsview/internal/postgres 0.056s

agent-session-viewer-frontend@0.1.0 test
vitest run --run src/lib/components/content/ToolBlock.test.ts src/lib/utils/messages.test.ts

RUN v4.0.18 /home/trent/src/agentsview/frontend

✓ src/lib/utils/messages.test.ts (13 tests) 2ms
✓ src/lib/components/content/ToolBlock.test.ts (14 tests) 36ms

Test Files 2 passed (2)
Tests 27 passed (27)
Start at 21:22:57
Duration 755ms (transform 341ms, setup 0ms, import 451ms, tests 37ms, environment 434ms)

The local 3-agent roborev hook on this commit came back with no medium-or-higher findings.

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Addressed the current roborev findings in 4ccd7af.

Changes:

  • made the Codex incremental full-parse fallback explicit at the sync call site and added an end-to-end incremental sync test that exercises the fallback path
  • changed multi-agent wait result extraction to preserve JSON object order, so result_events and event_index are deterministic across reparses

Re-verified with:

  • CC=/usr/bin/gcc CXX=/usr/bin/g++ CGO_ENABLED=1 go test -tags fts5 ./internal/parser ./internal/sync ./internal/db ./internal/postgres -count=1
  • cd frontend && npm test -- --run src/lib/components/content/ToolBlock.test.ts src/lib/utils/messages.test.ts

The local 3-agent default,security roborev hook on this commit came back with no medium-or-higher findings.

@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 28, 2026

roborev: Combined Review (4ccd7af)

Verdict: No medium-or-higher issues found across the reviewed outputs.

All three reviews agree this PR is clean at Medium, High, and Critical severity.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Copy link
Copy Markdown
Owner

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth looking at this with Claude and Codex for feedback, here is where we landed

Review: Codex subagent approach

The tool_result_events model and "history" panel are well-designed — good foundation for representing Codex's inline subagent data. One issue to address before merging:

Bug: SubagentSessionID links to nonexistent sessions

The parser sets subagent_session_id = "codex:{agent_id}" on spawn_agent tool calls (internal/parser/codex.go:244), and ToolBlock.svelte renders SubagentInline for any tool call with that ID. But no code on this branch creates codex:{agent_id} session rows in the DB. In practice, Codex tool blocks will show a "Subagent session" affordance that fails on expand.

Fix: Remove the annotateSubagentSessions call from ParseCodexSession and ParseCodexSessionFrom. Codex subagents don't have their own session files — there's nothing to link to. The "history" panel already provides the right display for this data.

Why not synthesize virtual child sessions?

Codex subagent data is inherently limited compared to Claude's. You only get:

  • The spawn_agent prompt/description
  • Periodic <subagent_notification> status blurbs
  • Terminal status from wait output

You don't get the subagent's actual conversation turns, tool calls, or intermediate reasoning — Codex doesn't write that to disk. Synthesizing fake session rows would produce nearly-empty child sessions (a prompt and a terminal status), which would look broken in a different way.

Minor corrections

  • The incremental fallback doesn't force a full parse for all sessions that have subagents. It only triggers when the newly appended chunk contains a wait call, subagent function output, or a terminal <subagent_notification> (codexIncrementalNeedsFullParse in codex.go).

Summary

The tool_result_events infrastructure and history UI are solid — merge after removing the SubagentSessionID annotation for Codex so SubagentInline doesn't render a broken affordance. A purpose-built inline component (timeline/log view instead of nested session) could be a nice follow-up if the generic history panel feels too plain.

// trigger a non-destructive re-sync (mtime reset + skip cache
// clear) so existing session data is preserved.
const dataVersion = 6
const dataVersion = 8
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be 7?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I bet that's an artifact of it doing v2 against the v1 pass where it initially bumped to 7. Good catch!

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Sounds good to me, I'll tackle that last piece in a bit.

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Addressed Wes's feedback in af10b67.

Changes:

  • removed the Codex-only annotateSubagentSessions(...) linkage so spawn_agent tool calls no longer advertise a nonexistent child session
  • kept the inline result-event/history model as the sole Codex subagent affordance
  • updated the Codex parser tests accordingly

Re-verified with:

  • CC=/usr/bin/gcc CXX=/usr/bin/g++ CGO_ENABLED=1 go test -tags fts5 ./internal/parser ./internal/sync ./internal/db ./internal/postgres -count=1

The local 3-agent default,security roborev hook on this commit came back clean.

@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Mar 28, 2026

roborev: Combined Review (af10b67)

Summary verdict: One medium-severity issue should be addressed before merge.

Medium

  • internal/parser/codex.go:242, internal/parser/codex.go:887
    wait/subagent outputs that report only running status are silently discarded. codexTerminalSubagentEvent() only recognizes completed and errored, and both handleFunctionCallOutput() and handleSubagentNotification() depend on it. A wait call that times out or otherwise returns a nonterminal "still running" state can therefore produce no result_events and no synthesized result_content, leaving the UI to display the tool call as if it had no output.
    Suggested fix: preserve nonterminal states as result events, or at minimum fall back to storing/rendering the raw wait output when no terminal event is extracted. Add coverage for running-only wait outputs and notifications.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Addressed the dataVersion review note in b05a3cb.

Changes:

  • corrected internal/db/db.go from dataVersion = 8 to dataVersion = 7
  • updated the migration test to use dataVersion - 1 instead of a stale hardcoded literal

Re-verified with:

  • CC=/usr/bin/gcc CXX=/usr/bin/g++ CGO_ENABLED=1 go test -tags fts5 ./internal/db ./internal/sync ./internal/postgres -count=1

The local 3-agent default,security roborev hook on this commit came back clean.

@tpn
Copy link
Copy Markdown
Contributor Author

tpn commented Mar 28, 2026

Addressed the latest roborev finding in b596371.

Changes:

  • preserved Codex running states as real result events instead of dropping them
  • updated the parser coverage so running-only wait output is retained, running notifications coexist with later completion, and running notifications now intentionally trigger the full-parse fallback path

Re-verified with:

  • CC=/usr/bin/gcc CXX=/usr/bin/g++ CGO_ENABLED=1 go test -tags fts5 ./internal/parser ./internal/sync ./internal/db ./internal/postgres -count=1

The local 3-agent default,security roborev hook on this commit came back with no medium-or-higher findings.

@wesm
Copy link
Copy Markdown
Owner

wesm commented Mar 28, 2026

I took a look at this locally and it LGTM

@wesm wesm merged commit c6c9afd into wesm:main Mar 28, 2026
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: model Codex subagent workflows instead of showing raw subagent_notification JSON

2 participants