fix(gate): address P1-P3 audit findings (May 13 2026) by openlawbot · Pull Request #34 · openlawteam/gate

openlawbot · 2026-05-13T17:32:11Z

Summary

Implements all six P1–P3 concerns from the May 13 2026 audit of activity.log + reviews.jsonl. Each concern was verified against the live logs and the current code before this PR.

Priority	Concern	Code surface
P1	TUI orphan log pane crash (`AttributeError: 'Vertical' object has no attribute 'previous_sibling'`, 40+ occurrences)	`tui.py`
P1	Quota 401 log storm (201 identical WARNs in 30 days)	`quota.py`
P2	Cancellation latency / superseded review tails	`orchestrator.py`, `runner.py`, `fixer.py`, `fixer_polish.py`
P2	`build.json` write race after worktree teardown (PR #318)	`orchestrator.py`
P2	Bot reviewing its own PRs (`Can not approve/request changes on your own pull request`, 4 occurrences)	`github.py`, `orchestrator.py`, `health.py`
P3	`fix_skipped` distinguished from `fix_failed` (PR #340 cooldown false-failure)	`schemas.py`, `logger.py`, `fixer.py`, `orchestrator.py`, `reports.py`, `tui.py`

What changed

P1.1 — TUI orphan log pane

gate/tui.py: replaced container.previous_sibling probe (not part of Textual's public Widget API) with panes_container.children index lookup. Handles the asymmetric first-pane case (no leading separator) by falling back to the trailing separator.

P1.2 — Quota 401 noise

gate/quota.py: new _auth_drift_marker_fresh() reuses the existing quota-auth-drift-alerted.txt 24h marker file. First 401 per window logs WARNING with a claude auth login --method browser hint; repeats log DEBUG. File-based so the gate survives process restarts.

P2.1 — Cancellation plumbing

gate/runner.py: run_with_retry(cancelled=) short-circuits between attempts and uses Event.wait(delay) for retry sleeps. StructuredRunner.run(cancelled=) switches to subprocess.Popen + a watchdog thread that calls proc.terminate() on cancel (worst-case cancel latency on a structured stage drops from structured_stage_s = 600 to ~0.5s).
gate/orchestrator.py: _spawn_and_wait_agent's time.sleep(5) → self._cancelled.wait(5). Both _run_agent_stage and _run_structured_stage thread cancelled=self._cancelled into run_with_retry.
gate/fixer.py: fix-rereview run_with_retry call gets cancelled=self._cancelled.
gate/fixer_polish.py: fix-polish self-audit run_with_retry call gets cancelled=pipeline._cancelled.

All new args default to None, preserving byte-identical legacy behavior for callers that don't opt in.

P2.2 — `build.json` race + misleading log message

gate/orchestrator.py: guarded the direct build.json write the same way _save_stage_result guards stage outputs. Added a try/except for the FileNotFoundError race after the guard runs but before the write completes.
Split the previously-misleading _save_stage_result log message: review cancelled for the cancel-flag path, workspace gone for the missing-workspace path. (The uniform "review cancelled" wording is what caused the audit to initially attribute the May 9 PR #326/#330 2-hour tails to cancellation — they were actually workspace-teardown races.)

P2.3 — Bot-PR routing

gate/github.py: new _is_own_pr(pr_author, config) helper. post_review(pr_author=) and approve_pr(pr_author=) short-circuit to comment_pr when the PR author matches config["repo"]["bot_account"]. The legacy stderr fallback is retained as belt-and-suspenders for the case where pr_author is unavailable (e.g. health.py orphan cleanup).
gate/orchestrator.py: threaded pr_author=self.pr_author, config=self.config through 5 approve_pr and 1 post_review call sites.
gate/health.py: passed config= to the orphan-cleanup approve_pr. pr_author is intentionally omitted (active_review.json doesn't persist it); the existing stderr fallback handles bot PRs in this rare recovery path.

P3.1 — `fix_skipped` vs `fix_failed`

gate/schemas.py: new FixResult.skipped: bool = False.
gate/logger.py: log_fix_result(status="skipped") writes decision: "fix_skipped". Decision lookup now uses a map. Updated docstring to document all four states.
gate/fixer.py: four cancellation/policy branches now return skipped=True (cancelled-before-start, cooldown/soft/lifetime limit, mid-iteration cancel, workspace-deleted). Real-failure branches (iteration-exhausted, crash) keep their existing skipped=False.
gate/orchestrator.py: new check-run branch emits neutral conclusion with "Gate Auto-Fix: skipped (<reason>)" for skipped results; log_fix_result call passes status="skipped".
gate/reports.py: fix_skipped added to _FIX_DECISIONS; excluded from the success-rate denominator alongside fix_no_op.
gate/tui.py: added "fix_skipped": "⚒○" icon + "dim" color to the decision maps.

Verification

pytest -q: 1274 passed (up from 1244; +30 new tests, +2 regression tests rewritten for the subprocess.run → Popen switch).
ruff check .: All checks passed.
mypy gate/<touched files>: 0 new errors. Pre-existing fixer_polish.py _polish_context baseline preserved (verified via git stash + re-run on main).

New tests by file

tests/test_runner.py (+7): run_with_retry cancellation across early-short-circuit / rate-limit-sleep / transient-sleep / legacy back-compat + StructuredRunner watchdog cancellation + post-completion no-op + no-watchdog when cancelled=None.
tests/test_logger.py (+2): status="skipped" mapping + no-regression on status="no_op".
tests/test_quota.py (+4): first 401 WARNs with re-auth hint, repeat 401 DEBUGs, post-cooldown re-WARN, non-auth failures unaffected.
tests/test_github.py (+5): approve_pr and post_review routing for bot author, human author, and empty author.
tests/test_reports.py (+3): fix_skipped excluded from rate, classified as fix-followup, mixed-outcome denominator.
tests/test_fixer.py (+5): each of the four cancellation/policy branches returns skipped=True; default skipped=False regression.
tests/test_orchestrator.py (+2): log-line split for cancel-set vs workspace-missing.

Explicit follow-ups (out of scope for this PR)

builder.run_build cancellation: a long npm ci still runs to completion before the new build.json guard fires. Threading cancelled into subprocess.run calls in builder.py is a separate change.
Replacement-review stages burning to 2400s (the actual cause of the May 9 PR #326 / #330 2-hour tails): empty result files at agent_stage_s deadline. Needs a separate investigation — distinct from the cancellation work this PR addresses.
Architecture + Security stage parallelization: P2 perf opportunity but not a correctness issue.
Anthropic usage API re-auth: manual ops task (claude auth login --method browser). The code-side noise suppression in P1.2 is the dev-side piece.

Test plan

Full pytest suite green (1274 / 1274)
Ruff clean
Mypy no new errors
Manual: trigger a supersede on a live PR; confirm the cancelled review's check-run flips to cancelled within seconds (was: up to ~5 min stacked retry sleeps).
Manual: run gate cancel --pr <N> mid-structured-stage; confirm claude subprocess is terminated.
Manual: with 3+ active reviews in the TUI, cancel one; confirm no Failed to remove log pane in activity.log and no flicker.
Manual: force a 401 (unset CLAUDE_CODE_OAUTH_TOKEN, clear keychain); confirm first review WARNs with re-auth hint, subsequent reviews DEBUG only.
Manual: bot opens a PR with a request_changes verdict; confirm no failed to create review: GraphQL: Review Can not request changes on your own pull request in activity.log; verdict appears as a PR comment.
Manual: trigger the fix pipeline twice in quick succession; confirm the cooldown-blocked second invocation logs decision: "fix_skipped" (not fix_failed) and the GitHub check-run shows neutral Gate Auto-Fix: skipped (Fix cooldown active …).

Rollback

All changes are backward-compatible. git revert of this commit fully restores prior behavior — new params (cancelled=, pr_author=) and new FixResult.skipped default to neutral values. Persisted state (state/, reviews.jsonl) is forward-compatible: a pre-PR gate process reading post-PR fix_skipped rows handles them via existing .get() defaults.

Made with Cursor

Six concerns from the May 13 log + code audit: - P1 TUI orphan log pane: replace `Vertical.previous_sibling` probe (not a Textual API) with `panes_container.children` index lookup. Removes the recurring `AttributeError` + rebuild-all-panes cascade observed 40+ times in `activity.log`. - P1 Quota 401 noise: gate the WARN/DEBUG decision on the existing `quota-auth-drift-alerted.txt` 24h marker. First 401 per window WARNs with a re-auth hint; repeats DEBUG. Was 201 identical WARNs in 30 days. - P2 Cancellation latency: thread `threading.Event` through `_spawn_and_wait_agent` (`Event.wait(5)` replaces `time.sleep(5)`), `run_with_retry` (interrupts exponential back-off sleeps), and `StructuredRunner.run` (`Popen` + watchdog terminates Claude on cancel). All four `run_with_retry` callers updated: orchestrator agent + structured stages, fixer fix-rereview, fixer_polish self-audit. Worst-case cancel latency drops from ~5min (stacked retry sleeps) to ~0s. - P2 build.json race: guard the direct write at orchestrator.py:433 the same way `_save_stage_result` guards stage outputs. Fixes the `FileNotFoundError` observed on PR #318. Split the misleading `_save_stage_result` log message: cancel-set logs "review cancelled", workspace-gone logs "workspace gone" (was uniformly "review cancelled", which mis-attributed the May 9 PR #326/#330 2-hour tails to cancellation). - P2 Self-PR routing: new `_is_own_pr(pr_author, config)` helper + `pr_author=` arg on `post_review` and `approve_pr`. Bot-authored PRs route directly to `comment_pr` instead of the GraphQL-rejected `gh pr review` round-trip. Wired through 5 `approve_pr` and 1 `post_review` call site in orchestrator + 1 `approve_pr` in health.py orphan cleanup. - P3 fix_skipped vs fix_failed: new `FixResult.skipped: bool`; `fix_skipped` decision in `log_fix_result`; orchestrator emits neutral check-run "Gate Auto-Fix: skipped (<reason>)" and `status="skipped"` for the four fixer branches that are cancellations or policy blocks (not real failures). reports.py + TUI updated so the new decision is excluded from success-rate denominator and rendered correctly. Fixes the spurious PR #340 `fix_failed` entry caused by cooldown blocking the polish loop. Explicit follow-ups in the PR description. Tests: +30 new (cancellation across all four callsites, fix_skipped per branch, post_review/approve_pr bot routing, quota 401 gating, misleading-log split). Rewrote 2 regression tests for the `subprocess.run` → `Popen` switch in `StructuredRunner`. pytest: 1274 passed (up from 1244). ruff: All checks passed. mypy: no new errors; pre-existing fixer_polish baseline preserved. Co-authored-by: Cursor <cursoragent@cursor.com>

openlawbot · 2026-05-13T17:51:23Z

Gate Review ✅

Approved — All six P1–P3 audit findings are correctly addressed and verified. Build is fully green (1268 passed, 6 pre-existing skips, ruff clean, no type errors). Architecture, security, and logic stages each returned zero findings. The logic reviewer explicitly verified all ten postconditions against the code: _is_own_pr bot-routing, approve_pr/post_review bot-comment bypass with escalation preserved, _auth_drift_marker_fresh/fail_open DEBUG-vs-WARNING gating, log_fix_result status→decision mapping, StructuredRunner.run Popen+watchdog correctness, run_with_retry Event.wait early-exit, _spawn_and_wait_agent _active_panes cleanup, and reports.summarize fix_skipped denominator exclusion. The Popen/watchdog threading refactor, quota suppression, TUI sibling-index fix, build.json guard, bot-PR routing, and fix_skipped vs fix_failed distinction are all behaviorally correct and test-covered. No actionable findings remain.

Build Results

ruff: ✅ (0 warnings)
pytest: ✅ (1268/1274 passed)

no issues across 6 stages (1097s, confidence: high)

openlawbot · 2026-05-13T17:51:28Z

Gate Review Summary

Verdict: ✅ Approved · confidence high · 1097s

All six P1–P3 audit findings are correctly addressed and verified. Build is fully green (1268 passed, 6 pre-existing skips, ruff clean, no type errors). Architecture, security, and logic stages each returned zero findings. The logic reviewer explicitly verified all ten postconditions against the code: _is_own_pr bot-routing, approve_pr/post_review bot-comment bypass with escalation preserved, _auth_drift_marker_fresh/fail_open DEBUG-vs-WARNING gating, log_fix_result status→decision mapping, StructuredRunner.run Popen+watchdog correctness, run_with_retry Event.wait early-exit, _spawn_and_wait_agent _active_panes cleanup, and reports.summarize fix_skipped denominator exclusion. The Popen/watchdog threading refactor, quota suppression, TUI sibling-index fix, build.json guard, bot-PR routing, and fix_skipped vs fix_failed distinction are all behaviorally correct and test-covered. No actionable findings remain.

Build

typecheck ✅ · lint ✅ · tests ✅

Stats

0 blocking · 0 other · 6 stages run

Next steps

This PR is approved by Gate.
A human reviewer should still sign off if your branch protection requires it.

How to override

You want to…	Add this label, or comment
Skip Gate entirely	`gate-skip` or `/gate skip <reason>`
Re-run the review	`gate-rerun` or `/gate rerun`
Bypass blocking findings (audited)	`gate-emergency-bypass` or `/gate bypass <incident-link>`
Disable auto-fix for this PR	`gate-no-fix`

_{Updated 2026-05-13 17:51 UTC · sticky comment, replaced in-place on each review cycle.}

openlawbot merged commit 3673996 into main May 13, 2026
5 checks passed

openlawbot deleted the fix/p1-p3-audit-fixes branch May 13, 2026 18:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(gate): address P1-P3 audit findings (May 13 2026)#34

fix(gate): address P1-P3 audit findings (May 13 2026)#34
openlawbot merged 1 commit into
mainfrom
fix/p1-p3-audit-fixes

openlawbot commented May 13, 2026

Uh oh!

openlawbot commented May 13, 2026

Uh oh!

openlawbot commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

openlawbot commented May 13, 2026

Summary

What changed

P1.1 — TUI orphan log pane

P1.2 — Quota 401 noise

P2.1 — Cancellation plumbing

P2.2 — build.json race + misleading log message

P2.3 — Bot-PR routing

P3.1 — fix_skipped vs fix_failed

Verification

New tests by file

Explicit follow-ups (out of scope for this PR)

Test plan

Rollback

Uh oh!

openlawbot commented May 13, 2026

Gate Review ✅

Build Results

Uh oh!

openlawbot commented May 13, 2026

Gate Review Summary

Build

Stats

Next steps

How to override

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

P2.2 — `build.json` race + misleading log message

P3.1 — `fix_skipped` vs `fix_failed`