feat: add Codex CLI agent integration#772

Open
Soph wants to merge 16 commits into main from soph/codex-agent-integration

Soph (Collaborator) commented Mar 25, 2026

Summary

Adds agent integration for OpenAI Codex CLI, supporting lifecycle hooks, transcript analysis, token tracking, and prompt extraction.

What's implemented

  • Agent package (cmd/entire/cli/agent/codex/) with all core + optional interfaces:
    • Agent, HookSupport, HookResponseWriter — lifecycle hooks via .codex/hooks.json
    • TranscriptAnalyzer — extracts modified files from apply_patch tool calls in rollout JSONL
    • TokenCalculator — reads cumulative token usage from event_msg.token_count entries
    • PromptExtractor — extracts user prompts from response_item messages with role: "user"
  • Hook events: SessionStart → SessionStart, UserPromptSubmit → TurnStart, Stop → TurnEnd, PreToolUse → pass-through (no lifecycle action)
  • Config setup (entire enable --agent codex):
    • .codex/hooks.json — per-repo lifecycle hooks
    • .codex/config.toml — per-repo features.codex_hooks = true (feature is still under development in Codex)
    • ~/.codex/config.toml — project trust entry (required for Codex to load project-level config)
  • E2E runner (e2e/agents/codex.go) with isolated CODEX_HOME, auth symlink, model pinning, startup dialog dismissal
  • 43 unit tests across 4 test files
  • Research one-pager at cmd/entire/cli/agent/codex/AGENT.md
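
The hook-event mapping listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `LifecycleEvent` type and the `mapHookEvent` function are hypothetical names, and only the four hook names and their targets come from the description.

```go
package main

import "fmt"

// LifecycleEvent is a hypothetical name for the framework-side event type.
type LifecycleEvent string

const (
	SessionStart LifecycleEvent = "SessionStart"
	TurnStart    LifecycleEvent = "TurnStart"
	TurnEnd      LifecycleEvent = "TurnEnd"
)

// mapHookEvent translates a Codex hook name into a lifecycle event.
// The second return value is false for hooks that are passed through
// without a lifecycle action (PreToolUse, and anything unrecognised).
func mapHookEvent(hook string) (LifecycleEvent, bool) {
	switch hook {
	case "SessionStart":
		return SessionStart, true
	case "UserPromptSubmit":
		return TurnStart, true
	case "Stop":
		return TurnEnd, true
	default:
		return "", false
	}
}

func main() {
	for _, h := range []string{"SessionStart", "UserPromptSubmit", "Stop", "PreToolUse"} {
		ev, ok := mapHookEvent(h)
		fmt.Printf("%-16s -> %q (lifecycle action: %v)\n", h, ev, ok)
	}
}
```

Returning a boolean rather than an error keeps the pass-through case cheap: the caller can acknowledge the hook and exit without treating it as a failure.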

What's missing (blocked on Codex side)

| Gap | Reason | Impact |
| --- | --- | --- |
| No SessionEnd hook | Codex doesn't fire a hook when a session terminates, only at end-of-turn (Stop) | Framework handles gracefully; session cleanup is less precise but functional |
| PostToolUse | PR still in review on Codex side | Can't track tool outputs; no SubagentAwareExtractor possible yet |
| No subagent hooks | No PreTask/PostTask equivalent | Can't track spawned subagents |
| codex_hooks feature flag | Default off (stage: UnderDevelopment) | entire enable writes it to project config + trusts the project, but if the feature graduates to stable this config becomes unnecessary |
| PreToolUse is shell-only | Only fires for Bash tool, not MCP/other tools | File modification detection relies on transcript parsing (apply_patch), not hooks |
| TranscriptPreparer | Not needed: Codex materializes rollout before firing hooks | Confirmed by reading hook_transcript_path() → ensure_rollout_materialized() in Codex source |
| TextGenerator | No codex --print equivalent | Can't generate AI-powered metadata (trail titles, summaries) |

Test plan

  • mise run test — all unit tests pass (43 new tests)
  • mise run lint — clean (only pre-existing ireturn issues in capabilities.go)
  • mise run test:e2e --agent codex TestSingleSessionManualCommit — hooks fire, checkpoint created
  • mise run test:e2e --agent codex TestInteractiveMultiStep — interactive session with prompt pattern
  • mise run test:e2e:canary — Vogon canary unaffected

🤖 Generated with Claude Code


Note

Medium Risk
Adds a new built-in agent integration that installs hooks and writes Codex config/trust entries (including in ~/.codex/config.toml), which could affect local developer environments if the install/uninstall logic is wrong.

Overview
Adds a new built-in codex agent integration with lifecycle hook handling, transcript parsing, token usage calculation, and prompt extraction based on Codex JSONL rollout files.

Implements hook installation/uninstallation for .codex/hooks.json, plus automatic enabling of Codex’s codex_hooks feature in project .codex/config.toml and adding a per-repo trust entry to the user’s ~/.codex/config.toml. Registers the new agent in the registry/CLI imports, adds Codex-specific E2E harness support, and includes a Codex integration one-pager and unit tests covering hooks, lifecycle parsing, and transcript analysis.

Written by Cursor Bugbot for commit 81d10dc.

Soph and others added 14 commits March 25, 2026 10:39
Phase 1 of agent integration: research Codex CLI's hook mechanism,
transcript format, and configuration. Codex supports SessionStart,
UserPromptSubmit, Stop, and PreToolUse (shell only) hooks via
hooks.json config files with JSON stdin/stdout transport.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 719dd8da5a34
Add full agent integration for OpenAI's Codex CLI:

- Agent package (cmd/entire/cli/agent/codex/) implementing Agent,
  HookSupport, and HookResponseWriter interfaces
- Hook support for SessionStart, UserPromptSubmit, Stop events via
  .codex/hooks.json config files
- E2E test runner (e2e/agents/codex.go) for running Codex in E2E tests
- Agent registered as "codex" with type "Codex"
- 21 unit tests covering lifecycle parsing, hook install/uninstall,
  idempotency, config preservation, and interface compliance

Codex hooks use the same JSON stdin/stdout transport as Claude Code.
PreToolUse is pass-through (shell only, no lifecycle action needed).
No SessionEnd hook — handled gracefully by the framework.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 410f22073a20
The codex_hooks feature is default_enabled: false (UnderDevelopment stage)
in Codex CLI. Without --enable codex_hooks, hooks.json is completely
ignored and no lifecycle hooks fire. This caused the E2E test to fail
because no session was ever started.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7544b7fbb788
Instead of passing --enable codex_hooks to every codex invocation,
InstallHooks now writes features.codex_hooks = true to .codex/config.toml
in the repo root. This is per-repo only — doesn't affect other projects.

The E2E runner no longer needs the --enable flag since entire enable
handles the feature activation through the config file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: e81dffce2554
Project-level .codex/config.toml triggers Codex's trust system: if the
project isn't trusted, the entire config layer is disabled — which also
prevents hooks.json from being discovered. This is why hooks weren't
firing in E2E tests.

Changes:
- ensureFeatureEnabled now writes to ~/.codex/config.toml (user-level)
  which is always loaded regardless of project trust
- E2E runner uses --enable codex_hooks CLI flag since tests run in
  isolated temp dirs that shouldn't modify the real user config
- Removed --skip-git-repo-check (E2E test repos are real git repos)
- Added test for preserving existing user config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 030a5eaac016
Codex has a project trust system: untrusted projects have their .codex/
config layer disabled, which prevents hooks.json from being discovered.
InstallHooks now writes a trust entry for the repo to ~/.codex/config.toml:

  [projects."/path/to/repo"]
  trust_level = "trusted"

This is alongside the codex_hooks feature flag. Both are written to the
user-level config because project-level config.toml itself requires trust
to be loaded (chicken-and-egg problem).
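
A rough sketch of how such a trust entry could be rendered is shown below. The `trustEntry` function and the escaping helper are hypothetical; only the table and key names are taken from the snippet above. It escapes backslashes and double quotes in the path and assumes the path contains no control characters.

```go
package main

import (
	"fmt"
	"strings"
)

// tomlEscaper escapes the two characters most likely to appear in a path
// that would otherwise break a TOML basic string. Control characters are
// not handled in this sketch.
var tomlEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`)

// trustEntry renders the TOML table that marks repoPath as trusted.
// The function name is hypothetical; the key names come from the commit message.
func trustEntry(repoPath string) string {
	return fmt.Sprintf("[projects.\"%s\"]\ntrust_level = \"trusted\"\n",
		tomlEscaper.Replace(repoPath))
}

func main() {
	fmt.Print(trustEntry("/path/to/repo"))
}
```

A real installer would also need to merge this into an existing user config without clobbering other entries, which this sketch does not attempt.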

E2E runner still uses --enable codex_hooks CLI flag for isolation (avoids
modifying the real ~/.codex/config.toml during tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 081b8b2da476
Trust in ~/.codex/config.toml unlocks the project layer, so the
codex_hooks feature flag can live in .codex/config.toml (per-repo):

  .codex/hooks.json      — Entire lifecycle hooks (per-repo)
  .codex/config.toml     — features.codex_hooks = true (per-repo)
  ~/.codex/config.toml   — projects."<path>".trust_level = "trusted"

E2E runner creates an isolated CODEX_HOME per test with trust +
feature flag pre-seeded via seedCodexHome(), exercising the real
config discovery path instead of --enable CLI bypass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 302b87bb7016
The isolated CODEX_HOME didn't have auth credentials, causing 401
Unauthorized errors. Symlink auth.json from ~/.codex/ so OAuth/token
auth works alongside OPENAI_API_KEY env var auth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 37e018398c55
Codex shows an interactive model upgrade dialog when a new model is
available. Pin model in config to avoid it, and add a dismissal loop
for any remaining startup dialogs in interactive mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: d6053da8bf90
Codex uses › (single right-pointing angle quotation mark) not >
(greater-than sign) as its input prompt indicator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 60bf4d6f83d3
These were accidentally lost during a git stash/checkout cycle earlier
in this branch. Restoring from origin/main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f1c4e805406d
Fixes from /simplify review:
- Fix CLAUDE_PROJECT_DIR copy-paste bug in localDev hook commands;
  use git rev-parse --show-toplevel like other non-Claude agents
- Extract resolveCodexHome() to deduplicate CODEX_HOME resolution
  between GetSessionDir and ensureProjectTrusted
- Use cmdPrefix variable to reduce hook command string repetition
- Fix StartSession env isolation: pass unsetEnv list to NewTmuxSession
  instead of unused filterEnv result
- Eliminate duplicate args/displayArgs construction in E2E runner
- Make Bootstrap fail-fast on CI when OPENAI_API_KEY is missing
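
For illustration, a helper like resolveCodexHome() might look roughly like this. The signature is an assumption (the real function may read the environment directly), and the fallback to `.codex` under the user's home directory is inferred from the `~/.codex/config.toml` path used elsewhere in this PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveCodexHome returns the Codex home directory: the CODEX_HOME
// environment variable when set, otherwise <userHome>/.codex.
// getenv is injected so the lookup can be exercised without touching
// the real process environment.
func resolveCodexHome(getenv func(string) string, userHome string) string {
	if v := getenv("CODEX_HOME"); v != "" {
		return v
	}
	return filepath.Join(userHome, ".codex")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	fmt.Println(resolveCodexHome(os.Getenv, home))
}
```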

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: a156175f32b7
…for Codex

Parse Codex's rollout JSONL format to extract:

- Modified files: from apply_patch custom_tool_call entries
  (*** Add File: / *** Update File: / *** Delete File: patterns)
- Token usage: from event_msg token_count entries with
  total_token_usage (input, cached, output, reasoning tokens)
- User prompts: from response_item messages with role="user"

19 new tests covering all three interfaces with offset support,
edge cases, and the apply_patch file extraction regex.
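
As a sketch of the file-extraction idea, the three header patterns named above can be matched with a single multi-line regular expression. The regex and the `filesFromPatch` name are illustrative rather than copied from this PR, and the sample patch (including its Begin/End envelope lines) is made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// patchFileRe matches the apply_patch file headers named in the commit
// message. (?m) lets ^ and $ anchor at line boundaries.
var patchFileRe = regexp.MustCompile(`(?m)^\*\*\* (?:Add|Update|Delete) File: (.+)$`)

// filesFromPatch returns the distinct file paths touched by a patch body,
// in order of first appearance.
func filesFromPatch(patch string) []string {
	seen := make(map[string]bool)
	var files []string
	for _, m := range patchFileRe.FindAllStringSubmatch(patch, -1) {
		p := strings.TrimSpace(m[1]) // also drops a trailing \r from CRLF input
		if p != "" && !seen[p] {
			seen[p] = true
			files = append(files, p)
		}
	}
	return files
}

func main() {
	patch := strings.Join([]string{
		"*** Begin Patch",
		"*** Add File: docs/new.md",
		"+hello",
		"*** Update File: cmd/main.go",
		"@@",
		"-old",
		"+new",
		"*** Delete File: old.txt",
		"*** Update File: cmd/main.go",
		"*** End Patch",
	}, "\n")
	fmt.Println(filesFromPatch(patch))
}
```

Deduplicating at extraction time means a file patched several times in one turn is reported once.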

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 8dfb5f49acdf
ReadSession now extracts modified files from the rollout JSONL by
reusing extractFilesFromLine (parses apply_patch entries). Previously
ModifiedFiles was always empty.

Also confirmed TranscriptPreparer is not needed: Codex calls
ensure_rollout_materialized() before passing transcript_path to hooks,
so the rollout file is always fully written when our Stop hook fires.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 5ba29671f33f
Copilot AI review requested due to automatic review settings March 25, 2026 15:16
@Soph Soph requested a review from a team as a code owner March 25, 2026 15:16
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Copilot AI left a comment

Pull request overview

Adds first-class integration for the OpenAI Codex CLI agent to Entire’s agent framework, including hook installation, lifecycle event parsing, transcript analysis (modified files + prompts), and token usage extraction, plus E2E support and research notes.

Changes:

  • Register the new codex agent in CLI wiring and the agent registry.
  • Add Codex agent implementation (hooks install/uninstall, lifecycle parsing, transcript parsing, token usage & prompt extraction) with unit tests and a one-pager.
  • Add an E2E runner for Codex and a shell probe script for manual validation of hook payloads.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| scripts/test-codex-agent-integration.sh | Adds a local probe script that installs capture hooks and inspects captured payloads. |
| e2e/agents/codex.go | Adds an E2E agent runner that seeds isolated Codex home/config and runs Codex in exec/tmux modes. |
| cmd/entire/cli/hooks_cmd.go | Ensures the Codex agent is registered for hook subcommands. |
| cmd/entire/cli/config.go | Ensures the Codex agent is registered for config/enable flows. |
| cmd/entire/cli/agent/registry.go | Adds Codex agent name/type constants. |
| cmd/entire/cli/agent/codex/* | Implements Codex agent integration (hooks, lifecycle parsing, transcript parsing, tests, and documentation). |
| cmd/entire/cli/agent/capabilities.go | Adjusts capability type-assertion helpers (but currently risks lint failures). |

Soph and others added 2 commits March 25, 2026 16:27
- Fix InstallHooks dropping non-hooks top-level keys from hooks.json
  (now preserves $schema and future fields via round-trip)
- Fix AreHooksInstalled using Claude-specific CLAUDE_PROJECT_DIR;
  now uses isEntireHook() which matches both production and localDev
  prefixes
- Fix CalculateTokenUsage returning cumulative totals instead of
  per-checkpoint delta (now subtracts baseline at/before offset)
- Fix ReadSession not deduplicating ModifiedFiles
- Fix ResolveSessionFile returning bare session ID; now handles both
  absolute paths (from hooks) and fallback path construction
- Fix InstallHooks early return skipping feature flag + trust setup
- Fix StartSession not forwarding OPENAI_API_KEY/HOME/TERM to tmux
- Make seeded model configurable via E2E_CODEX_MODEL env var
- Handle filepath.Abs errors in E2E runner
- Remove local filesystem paths from AGENT.md
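
The CalculateTokenUsage fix above (per-checkpoint delta instead of cumulative totals) can be sketched as below. The `Usage` and `snapshot` types and the `deltaSince` function are hypothetical; the sketch assumes snapshots are ordered by transcript line and each carries the cumulative total reported at that line.

```go
package main

import "fmt"

// Usage holds cumulative token counters (field names are illustrative).
type Usage struct {
	Input, Cached, Output, Reasoning int
}

// sub returns u minus b, counter by counter.
func (u Usage) sub(b Usage) Usage {
	return Usage{
		Input:     u.Input - b.Input,
		Cached:    u.Cached - b.Cached,
		Output:    u.Output - b.Output,
		Reasoning: u.Reasoning - b.Reasoning,
	}
}

// snapshot is a cumulative total observed at a given transcript line.
type snapshot struct {
	Line  int
	Total Usage
}

// deltaSince returns the usage accrued after offset: the latest cumulative
// total minus the last total seen at or before offset. Snapshots must be
// ordered by Line.
func deltaSince(snaps []snapshot, offset int) Usage {
	var baseline, latest Usage
	for _, s := range snaps {
		if s.Line <= offset {
			baseline = s.Total
		}
		latest = s.Total
	}
	return latest.sub(baseline)
}

func main() {
	snaps := []snapshot{
		{Line: 2, Total: Usage{Input: 100, Cached: 10, Output: 20, Reasoning: 5}},
		{Line: 5, Total: Usage{Input: 250, Cached: 40, Output: 60, Reasoning: 15}},
		{Line: 9, Total: Usage{Input: 400, Cached: 40, Output: 90, Reasoning: 15}},
	}
	fmt.Printf("%+v\n", deltaSince(snaps, 5))
}
```

With an offset of 0 the baseline stays at zero, so the result equals the latest cumulative total, which matches the behaviour expected for a session's first checkpoint.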

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f3f2130088d1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 37f449ec7d97

2 participants