Codex CLI integration: hook adapter, integration folder, and managed-prefs hardening #51

Open
z26zheng wants to merge 3 commits into `codex-integration-docs` from `codex-integration-implementation`

@z26zheng

Collaborator

Stacked on top of #50 (the planning docs). Once #50 lands on `main`, the base of this PR becomes `main`.

Summary

Implements the Codex CLI integration described in the planning docs in #50. Three logical commits:

  1. `feat(hook):` add Codex client kind and output format to permit0 hook adapter — implements `--client codex` for the `permit0 hook` subcommand: empty stdout = allow/defer, JSON deny envelope for Deny/HITL, fail-closed on any internal error. 7 new integration tests cover all wire shapes including the fail-closed paths.

  2. `feat(integrations):` add `permit0-codex` with examples, demo rig, and safe managed-prefs installer — user-facing home for the integration at `integrations/permit0-codex/`. Three artifacts: paste-ready `examples/` (TOML, JSON, macOS installer), a runnable `dev-test-rig/` (instrumented hook wrapper, mock-gmail MCP server, demo launcher, live event watcher), and corresponding README walkthroughs.

  3. `docs(codex):` add packaging plan and post-implementation reviews — `07-packaging.md` scoping the safety hardening, plus reviews `9be273b5/` and `a7ebc31f/07-08-*` that drove it.

What this changes for end users

| Before this PR | After this PR |
| --- | --- |
| No way to gate Codex tool calls via permit0 | `permit0 hook --client codex` is a first-class option |
| No setup recipe | Copy-paste `config.toml` snippet or run `install-managed-prefs.sh` |
| No reproducible demo | `bash integrations/permit0-codex/dev-test-rig/codex-demo` opens a full live demo with mock MCP |

Safety story (why commit 2 looks longer than it needs to)

The macOS installer writes to `com.openai.codex/requirements_toml_base64` — the same defaults key an enterprise MDM uses to manage Codex policy. A naive installer would silently overwrite an org's managed config the moment anyone ran the demo. This PR ships the hardened version from day one:

  • Every write stamps the TOML body with `# permit0-managed: installed by integrations/permit0-codex`.
  • The installer refuses to overwrite an unstamped value unless `--force` is passed, and writes a timestamped backup to `~/.permit0/managed-prefs-backup-<TS>.toml` before any overwrite.
  • `--uninstall` restores the most recent backup by mtime.
  • `dev-test-rig/cleanup` checks the same stamp before deleting; the demo launcher itself refuses to install over a non-permit0 value.
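
The stamp-and-refuse rule in the bullets above can be sketched as a small decision function. This is an illustrative Python model, not the shipped shell installer: `stamp_toml`, `encode_pref`, and `plan_write` are hypothetical names, and the choice to skip the backup when the existing value already carries the permit0 stamp is an assumption the PR text does not settle.

```python
import base64

# Stamp text quoted in the PR description.
STAMP = "# permit0-managed: installed by integrations/permit0-codex"


def stamp_toml(body):
    """Prefix a TOML body with the ownership stamp (no-op if already stamped)."""
    if STAMP in body:
        return body
    return STAMP + "\n" + body


def encode_pref(body):
    """Base64-encode a TOML body, as the key name requirements_toml_base64 suggests."""
    return base64.b64encode(body.encode("utf-8")).decode("ascii")


def plan_write(existing_b64, force):
    """Decide what an installer should do given the current defaults value.

    Returns one of: "write", "backup_then_write", "refuse".
    """
    if not existing_b64:
        return "write"  # nothing to clobber
    existing = base64.b64decode(existing_b64).decode("utf-8", errors="replace")
    if STAMP in existing:
        # Assumption: replacing our own stamped value needs no backup.
        return "write"
    # Unstamped value: possibly an org-managed policy, so require --force.
    return "backup_then_write" if force else "refuse"


if __name__ == "__main__":
    foreign = encode_pref('approval_policy = "never"')  # made-up foreign policy
    print(plan_write(foreign, force=False))
    print(plan_write(foreign, force=True))
```

Keeping the decision separate from the actual `defaults` write makes the refuse path easy to test without touching real preferences.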

End-to-end verification

Manual run against Codex 0.130.0-alpha.5 on macOS (transcript in `docs/plans/codex-integration/06-real-codex-testing.md`):

  • ✅ `codex exec` with a Bash prompt fires the hook, permit0 defers (no Bash pack), Codex runs the command. Round-trip: 65ms.
  • ✅ `codex exec` asking for a gmail send to an external recipient fires the hook, permit0's Gmail pack scores it Medium (HITL), Codex receives a deny envelope and refuses the MCP call. The mock MCP server's log confirms `tools/call` was never reached — permit0 stopped the chain at `PreToolUse`.
  • ✅ The hook hot path stays under 100ms end-to-end.

Live-fire diagnostics (per-invocation `stdin.json`/`stdout`/`stderr` plus a JSONL `events.log`) are written under `/tmp/permit0-codex-test/` so a reviewer can reproduce any decision after the fact.
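
As a rough illustration of how a reviewer might replay decisions from the JSONL log, the sketch below classifies each row by whether the hook printed anything. The row fields (`id`, `tool`, `stdout`) are hypothetical; the PR does not list the actual `events.log` schema, so the keys would need adjusting to match the real file.

```python
import json


def classify(row):
    """Empty hook stdout means Codex proceeds; any output is treated as a deny."""
    return "DENY" if row.get("stdout", "").strip() else "ALLOW/DEFER"


def summarize(log_text):
    """Parse JSONL text and return (tool, decision) pairs in log order."""
    results = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between rows
        row = json.loads(line)
        results.append((row.get("tool", "?"), classify(row)))
    return results


# Two made-up rows standing in for real events.log content.
sample = "\n".join([
    json.dumps({"id": "inv-1", "tool": "Bash", "stdout": ""}),
    json.dumps({"id": "inv-2", "tool": "gmail_send",
                "stdout": '{"permissionDecision": "deny"}'}),
])

if __name__ == "__main__":
    for tool, decision in summarize(sample):
        print(tool, decision)
```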

Test plan

  • `cargo fmt --all --check` clean
  • `cargo clippy --all-targets -- -D warnings` clean
  • `cargo test --workspace --exclude permit0-py` — 640 tests pass across 15 binaries
  • `scripts/test-codex-hook.sh` — 9/9 synthetic scenarios pass
  • `scripts/test-managed-prefs-roundtrip.sh` — 11/11 assertions across 4 phases pass (skips gracefully on Linux)
  • Live `codex exec` smoke from a clean `CODEX_HOME`: hook fires for Bash, denies gmail-to-external
  • Reviewer: run `bash integrations/permit0-codex/dev-test-rig/codex-demo` to reproduce the live demo end-to-end (requires Codex installed)

Files in this PR

```
crates/permit0-cli/src/cmd/hook.rs 1217 ++++++++++++++++++++++++++++++---
crates/permit0-cli/src/main.rs 33 +-
crates/permit0-cli/tests/cli_tests.rs 310 +++++++++
scripts/test-codex-hook.sh 186 +++++
scripts/test-managed-prefs-roundtrip.sh 231 ++++++++++++
integrations/permit0-codex/** 1303 ++++++++++++++++++++++++++++++++
integrations/README.md 12 +-
docs/plans/codex-integration/03-configuration.md 8 +
docs/plans/codex-integration/07-packaging.md 255 +++++++++++++
docs/plan-reviews/codex-integration/9be273b5/** 127 +++++
docs/plan-reviews/codex-integration/a7ebc31f/07-* 222 +++++++++++
docs/plan-reviews/codex-integration/a7ebc31f/08-* 301 +++++++++++++
docs/plan-reviews/codex-integration/a7ebc31f/00-summary.md ±±
```

🤖 Co-authored-by: Cursor AI agent

Made with Cursor

z26zheng and others added 3 commits May 10, 2026 16:15
…dapter

Extends `permit0 hook` so that `--client codex` produces output that
satisfies Codex's PreToolUse hook contract: empty stdout = no objection,
or a JSON deny envelope with `permissionDecision: "deny"`. The Codex
format never emits `"allow"` or `"ask"` — both are explicitly rejected
by Codex.
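
A minimal sketch of that contract, written in Python for brevity rather than the adapter's Rust. Only `permissionDecision: "deny"` and the human-review marker come from this commit message; the `reason` key, the flat envelope layout, and the verdict strings are assumptions made for illustration.

```python
import json

# Marker quoted in the commit message; \u2014 is the dash it uses.
HITL_MARKER = " \u2014 requires human review"


def render_codex_stdout(verdict, reason=""):
    """Return the text the hook would print on stdout for a given verdict."""
    if verdict in ("allow", "defer"):
        return ""  # empty stdout: no objection
    if verdict == "hitl":
        reason += HITL_MARKER  # lets a reader tell HITL from a hard block
    elif verdict != "deny":
        reason = "unknown verdict %r" % (verdict,)  # fail closed
    # Hypothetical envelope layout; the real one may nest these fields.
    envelope = {"permissionDecision": "deny", "reason": reason}
    return json.dumps(envelope)


if __name__ == "__main__":
    print(repr(render_codex_stdout("defer")))
    print(render_codex_stdout("hitl", "external recipient"))
```

Because the only non-empty output is built from a literal `"deny"`, the "never emit allow or ask" invariant holds by construction rather than by convention.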

Behavior:
- Codex MCP tools arrive as `mcp__<server>__<tool>` (same as Claude
  Code); the existing prefix-stripping path is shared.
- HumanInTheLoop verdicts map to a Codex deny envelope with a
  ` — requires human review` marker so the model and the user can
  tell HITL apart from a hard Critical block.
- All internal errors (deserialization failures, remote-daemon
  unreachable, normalizer panics) are caught at the outer `run()`
  layer and converted to a deny envelope. Fail-closed is critical for
  Codex because empty stdout = allow.
- Optional Codex stdin fields (session_id, turn_id, cwd, model,
  hook_event_name, tool_use_id, transcript_path) are deserialized
  with `#[serde(default)]` so a minimal Claude-shaped payload still
  parses, and they're added to `RawToolCall.metadata` for forensic
  auditing in local mode.
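
The behaviors listed above can be sketched together in a few lines of Python. This is a simplified model, not the Rust adapter: the `tool_name` key, the `reason` key, and the `decide` callback are assumptions, while the `mcp__<server>__<tool>` shape and the optional field names are taken from the list.

```python
import json

# Optional Codex stdin fields named in the commit message.
OPTIONAL_FIELDS = ("session_id", "turn_id", "cwd", "model",
                   "hook_event_name", "tool_use_id", "transcript_path")


def split_mcp_name(tool_name):
    """Split 'mcp__<server>__<tool>' into (server, tool); other names pass through."""
    if tool_name.startswith("mcp__"):
        server, sep, tool = tool_name[len("mcp__"):].partition("__")
        if sep and server and tool:
            return server, tool
    return None, tool_name


def parse_payload(stdin_text):
    """Parse hook stdin; optional fields default to absent instead of failing."""
    data = json.loads(stdin_text)  # raises on empty or malformed input
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    server, tool = split_mcp_name(data["tool_name"])  # hypothetical key
    metadata = {k: data[k] for k in OPTIONAL_FIELDS if k in data}
    return {"server": server, "tool": tool, "metadata": metadata}


def run(stdin_text, decide):
    """Outer layer: any exception becomes a deny envelope, never empty stdout."""
    try:
        return decide(parse_payload(stdin_text))
    except Exception as exc:
        return json.dumps({"permissionDecision": "deny",
                           "reason": "permit0 internal error: %s" % exc})


if __name__ == "__main__":
    print(split_mcp_name("mcp__gmail__gmail_send"))
    print(run("", lambda call: ""))  # empty stdin fails closed
```

Catching at the outermost layer matters here because a crash that printed nothing would be read by Codex as "no objection".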

`OutputFormat::from_client(ClientKind)` centralizes the
client → wire-protocol mapping so callers never have to choose the
wire format independently from the client kind.

Test coverage:
- 7 new integration tests in `crates/permit0-cli/tests/cli_tests.rs`
  cover empty stdout for defer, deny envelope shape, malformed/empty
  stdin failing closed, remote-daemon failure failing closed, shadow
  mode emitting empty stdout, minimal-payload acceptance, and the
  invariant that `permissionDecision: "allow"` is NEVER emitted.
- `scripts/test-codex-hook.sh` runs the same code path against the
  release binary in 9 canned scenarios as a smoke test.

Co-authored-by: Cursor <cursoragent@cursor.com>
…fe managed-prefs installer

User-facing home for the Codex integration. `integrations/permit0-codex/`
ships three things:

1. **examples/** — paste-ready snippets for two install paths
   - `config.toml.example` for the interactive `[hooks.PreToolUse]`
     trust flow (requires `/hooks` review in the TUI once)
   - `hooks.json.example` for the same in JSON form
   - `install-managed-prefs.sh` for the macOS unattended path that
     writes to `com.openai.codex/requirements_toml_base64` (treated by
     Codex as `legacy_managed_config_mdm` → auto-trusted)

2. **dev-test-rig/** — runnable end-to-end demo against a live Codex
   install
   - `codex-demo` launches an isolated CODEX_HOME with the hook
     installed via managed prefs and a mock-gmail MCP server wired in
   - `wrap-permit0.sh` is the instrumented hook entrypoint that
     captures `inv-<id>/{stdin.json,stdout,stderr,env}` plus a JSONL
     `events.log` row per fire
   - `watch` tails events.log with colored ALLOW/DEFER vs DENY rows
   - `cleanup` removes the managed-prefs hook
   - `mock-gmail-mcp.py` is a 150-line stdio MCP server exposing a
     fake `gmail_send` tool so the Gmail pack normalizer can be
     exercised without real credentials

The `install-managed-prefs.sh` and `dev-test-rig/cleanup` scripts both
write to `requirements_toml_base64` — the same slot enterprise MDMs use
for Codex policy. They are designed to refuse to clobber a non-permit0
value:

- Installer stamps its TOML body with
  `# permit0-managed: installed by integrations/permit0-codex`
  and refuses to overwrite an unstamped existing value without
  `--force`. With `--force` it writes a timestamped backup to
  `~/.permit0/managed-prefs-backup-<TS>.toml` first.
- `--uninstall` restores the most recent backup by mtime.
- `dev-test-rig/cleanup` checks the same stamp before deleting; the
  demo launcher itself refuses to install over a non-permit0 value.
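
The "most recent backup by mtime" rule for `--uninstall` might look like the following Python sketch. The function name `latest_backup` is hypothetical and the real installer is a shell script; the filename pattern follows the `managed-prefs-backup-<TS>.toml` convention above.

```python
import glob
import os
import tempfile


def latest_backup(backup_dir):
    """Return the newest managed-prefs backup by modification time, or None."""
    pattern = os.path.join(backup_dir, "managed-prefs-backup-*.toml")
    candidates = glob.glob(pattern)
    return max(candidates, key=os.path.getmtime) if candidates else None


# Demonstration in a throwaway directory with explicit mtimes.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, mtime in (("managed-prefs-backup-A.toml", 1000),
                        ("managed-prefs-backup-B.toml", 2000)):
        path = os.path.join(demo_dir, name)
        with open(path, "w") as fh:
            fh.write("# backup\n")
        os.utime(path, (mtime, mtime))  # set access and modification times
    newest = os.path.basename(latest_backup(demo_dir))

with tempfile.TemporaryDirectory() as empty_dir:
    none_found = latest_backup(empty_dir)

if __name__ == "__main__":
    print(newest, none_found)
```

Selecting by mtime rather than by parsing the timestamp in the filename keeps the restore logic independent of the exact `<TS>` format.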

`scripts/test-managed-prefs-roundtrip.sh` (208 lines, 11 assertions
across 4 phases) verifies all of this against the real macOS defaults
and restores any pre-existing developer state via an EXIT trap.

Documentation:
- `integrations/permit0-codex/README.md` is the user-facing setup
  recipe for both install paths
- `integrations/permit0-codex/dev-test-rig/README.md` is the demo
  walkthrough (prompts to try, forensics tour)
- `integrations/README.md` adds Codex to the CLI-hook integrations
  table
- `docs/plans/codex-integration/03-configuration.md` adds a pointer at
  the top to the integrations folder

Co-authored-by: Cursor <cursoragent@cursor.com>
After the implementation work landed, a second pass of plan reviews
identified a real safety bug: `install-managed-prefs.sh` and
`dev-test-rig/cleanup` were unconditionally writing to
`com.openai.codex/requirements_toml_base64`, which would silently
clobber any enterprise-MDM-installed Codex policy.

This commit adds:

- `docs/plans/codex-integration/07-packaging.md` — hardening plan that
  scopes the safety fix: stamp/detect/backup/uninstall pattern for the
  installer, stamp-check for the cleanup, Codex binary path fallback
  in the demo launcher, and a round-trip integration test. Drops
  redundant items that were already done (link from `03-configuration.md`,
  integrations table update).

- `docs/plan-reviews/codex-integration/9be273b5/` — independent review
  that originally flagged the safety bug, including its read of
  `06-real-codex-testing.md` and `07-packaging.md`.

- `docs/plan-reviews/codex-integration/a7ebc31f/07-review-real-codex-testing.md`
  and `08-review-packaging.md` — reviewer a7ebc31f's second-pass reviews
  of the post-implementation plans (06 and 07).

- `docs/plan-reviews/codex-integration/a7ebc31f/00-summary.md` — updated
  summary covering all eight reviewed plan docs.

The implementation that satisfies the 07-packaging plan ships in the
prior commit on this branch (the safety-hardened
`install-managed-prefs.sh`, `cleanup`, `codex-demo`, and round-trip
test).

Co-authored-by: Cursor <cursoragent@cursor.com>
@z26zheng z26zheng requested a review from AnissL93 as a code owner May 10, 2026 23:16