Skip to content

fix(p9): watcher folds gh exit into state; +abandon, +cleanup, --background alias#3

Merged
broomva merged 1 commit into
mainfrom
feat/watcher-fold-state
May 5, 2026
Merged

fix(p9): watcher folds gh exit into state; +abandon, +cleanup, --background alias#3
broomva merged 1 commit into
mainfrom
feat/watcher-fold-state

Conversation

@broomva
Copy link
Copy Markdown
Owner

@broomva broomva commented May 5, 2026

Summary

Closes five bugs surfaced by a fresh-session test against this very primitive — proving the dogfood loop works as a feedback channel. The fresh agent followed AGENTS.md's reflexive rule, hit `unrecognized arguments: --background`, and from there exposed cascading state-machine drift.

Bugs closed

# Severity Description
B1 doc/code drift AGENTS.md reflexive rule says `p9 watch --background` but the engine only had `--dry-run`. Every session following the rule hit `error: unrecognized arguments`.
B2 architectural / silent state drop `cmd_watch` spawned `gh pr checks --watch` detached and returned immediately. State.jsonl stayed at WATCHING forever even after CI completed — direct violation of the never silently drop state cardinal rule.
B3 concurrency deadlock No terminal subcommand for PRs closed externally. With `max_concurrent_prs=1`, an orphan would deadlock all future watches.
B5 type error `load_policy` accepted `Path

(B4 — no `p9` shim in `$PATH` — is a separate UX PR, queued.)

Fixes

B2 — architectural fix (the real one)

`p9 watch` is now foreground-by-default:

  1. Spawn `gh pr checks --watch`.
  2. Append `PUSHED → WATCHING`.
  3. Block on subprocess.
  4. Fold exit code into a state event: `0 → WATCHING → GREEN`; non-zero → `WATCHING → RED_UNCLASSIFIED` (caller runs `p9 heal --classify` next).

The agent invokes `p9 watch` via `run_in_background`, so the bg-task notification fires when the whole watch+fold completes — which is what the cardinal protocol actually wants. The bg-wrapper is the agent's harness; `p9 watch` no longer second-guesses with its own detach.

Old fire-and-forget behavior preserved via new `--detach` flag for callers that genuinely want it.

B1 — flag aliases

`--background` and `--block` added as aliases for the (now-default) foreground behavior. Historic AGENTS.md guidance keeps working without surprise.

B3 — terminal subcommands

  • `p9 abandon [--reason TEXT]` — idempotent on already-terminal states; errors cleanly on unknown PRs.
  • `p9 cleanup` — polls `gh pr view` for every open row; PRs reported as MERGED or CLOSED get a terminal ABANDONED event with reason. Still-OPEN PRs are left alone. Failed gh queries are reported but not abandoned (no false-positive drain).

State-machine table extended: `ABANDONED` now reachable from PUSHED, GREEN, MERGE_READY (already had WATCHING/HEALING/RED_*). All terminal states remain absorbing.

B5 — `load_policy` accepts `str`

Coerce with `Path(path)` at the boundary.

Tests

25 new tests under `tests/test_p9_pr_e.py`:

  • Watch fold — green / red / dry-run / `--detach` / `--background` / `--block` aliases (6).
  • Abandon — open / unknown / idempotent on terminal / from-GREEN (4).
  • Cleanup — no-op / merged / closed / open / gh-failure (5).
  • State-machine transitions — parametrized over all 7 non-terminals → ABANDONED, terminal-to-ABANDONED illegal (8).
  • `load_policy` str + Path acceptance (2).

Total: 108 passing (was 83). ~15s.

🤖 Generated with Claude Code

…ground alias

Closes five bugs surfaced by a fresh-session test against this very
primitive — proving the dogfood loop works as a feedback channel.

B1 (doc/code drift): AGENTS.md reflexive rule references
`p9 watch <pr> --background`, but the engine only had `--dry-run`. Every
session following the rule hit `error: unrecognized arguments: --background`.
Fix: add `--background` and `--block` as aliases for the (now-default)
foreground behavior.

B2 (architectural — silent state drop): `cmd_watch` spawned
`gh pr checks --watch` detached and returned immediately. When the gh
subprocess exited, *nothing* wrote a state transition. State.jsonl stayed
at WATCHING forever even after CI completed — a direct violation of the
"never silently drop state" cardinal rule.

Fix: foreground-by-default. `p9 watch` blocks on the gh subprocess, then
folds the exit code into a state event:
  - exit 0 → WATCHING → GREEN
  - exit non-0 → WATCHING → RED_UNCLASSIFIED (caller runs
    `p9 heal --classify` next)

The agent invokes `p9 watch` via `run_in_background`, so the bg-task
notification fires when the *whole* watch+fold completes — which is what
the cardinal protocol actually wants. The bg-wrapper is the agent's
harness; `p9 watch` no longer second-guesses with its own detach. Old
behavior is preserved via the new `--detach` flag for fire-and-forget
callers.

B3 (no terminal subcommand for orphans): when a PR is closed via
`gh pr close` (or merged outside `p9 auto-merge`), the local state row was
orphaned. With `max_concurrent_prs=1`, that orphan would deadlock all
future watches.

Fix:
- `p9 abandon <pr> [--reason TEXT]` — explicit terminal transition
  (idempotent on already-terminal states; error on unknown PR).
- `p9 cleanup` — polls `gh pr view` for every open row; PRs that GitHub
  reports as MERGED or CLOSED get a terminal ABANDONED event (with
  reason). PRs still OPEN are left alone. Failed gh queries are
  reported but not abandoned (no false-positive drain).

State-machine table extended: ABANDONED now reachable from PUSHED,
GREEN, and MERGE_READY (already had WATCHING/HEALING/RED_*). All
terminal states remain absorbing.

B5 (`load_policy` crashes on str): the function signature claimed
`Path | None` but the body called `.exists()` directly without coercion.
A test caller `p9.load_policy('/some/path')` raised AttributeError.
Fix: accept `Path | str | None`; coerce with `Path(path)` at the
boundary.

Tests: 25 new under tests/test_p9_pr_e.py — watch fold (green/red/dry/
detach/--background/--block aliases), abandon (open/unknown/idempotent/
from-green), cleanup (no-op/merged/closed/open/gh-failure), state-
machine transitions (parametrized over all non-terminals → ABANDONED).
Total now 108 passing (~15s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

Warning

Rate limit exceeded

@broomva has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a8115e51-307f-48a1-a322-efd0687adb3b

📥 Commits

Reviewing files that changed from the base of the PR and between a7bc1da and ad7ff2d.

📒 Files selected for processing (2)
  • scripts/p9.py
  • tests/test_p9_pr_e.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/watcher-fold-state

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@broomva broomva merged commit ca97733 into main May 5, 2026
4 checks passed
@broomva broomva deleted the feat/watcher-fold-state branch May 5, 2026 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant