Skip to content

fix(sdlc): converge policy_decide to the legacy gate allow-set (root-cause the scope:denied tightenings)#3851

Merged
ryanklee merged 1 commit into
mainfrom
eta/reform-policy-decide-converge-20260601
Jun 1, 2026
Merged

fix(sdlc): converge policy_decide to the legacy gate allow-set (root-cause the scope:denied tightenings)#3851
ryanklee merged 1 commit into
mainfrom
eta/reform-policy-decide-converge-20260601

Conversation

@ryanklee
Copy link
Copy Markdown
Collaborator

@ryanklee ryanklee commented Jun 1, 2026

Reform fix — converge policy_decide to the legacy gate allow-set

The 3b-cutover gate requires an asymmetric shadow window (every legacy-vs-new divergence a LOOSENING, ZERO tightening). After #3828 drove the replayed tightenings to 0, the window accrued NEW decision shapes the repo-relative reduction did not cover, re-opening ~111 tightenings (90 scope:denied + 13 scope:command + 5 claim + 3 assignment). The cutover was blocked on correctness, not calendar: cutting over would newly block currently-allowed work. Each class is root-caused from the gate's REAL decision log (~/.cache/hapax/cc-task-gate-decisions.jsonl).

What converged (all strict relaxations — never a new block)

Scope normalization (the dominant class):

  • expanduser every ref before comparison (legacy-gate parity, cc-task-gate.impl.sh:838) — ~/projects/<repo>/… / ~/Documents/… refs now match the absolute target.
  • a target presents BOTH worktree-relative (<wt>/<rest>, so a sister-repo ref hapax-coord/src/x matches) AND repo-relative (<rest>) forms.
  • the worktree anchor is no longer ~/projects/-only — a …/scratch/<name>/ clone resolves repo-relative too (/projects/ still wins for an inner scratch/).

Cognition / bookkeeping the real gate allowed (exit 0):

  • general /tmp/ scratch + ~/.cache/hapax/relay/ receipts are cognition; the governance-sensitive cc-active-task-* claim files under ~/.cache/hapax/ STAY gated.
  • a session's OWN claimed cc-task note (<task_id>.md) is governance bookkeeping — allowed regardless of scope/assignment (and surviving a reconciler-unassign race); a DIFFERENT task's note stays fully gated.

Argument-aware write detection (scope:command):

  • a python payload is a source write only with a real write SINK (write-mode open(), .write_*, shutil/os mutators) and only when its target is not solely scratch/cognition — a bare read open(x) / string-building heredoc is not. The crude pre-strip is superseded at this site (it mangled the mode/target tokens the check reads); it lives on only in legacy_bash_scope_block.
  • cp/mv/install/mkdir writing solely to /tmp (etc.) is scratch, not in-tree source.

Non-destructive window restart + cutover receipt:

  • evaluate_shadow_clean(window_start=…) scopes the clean window to decisions at/after a stamped boundary (replay now carries each row's original decision_ts); the full log/ledger are PRESERVED as evidence.
  • restart_window stamps the boundary; build_cutover_receipt/write_cutover_receipt emit a durable receipt (span_days, asymmetric_ok, cutover_eligible, countdown_days). policy-decide-shadow-eval gains --window-start-file / --restart-window / --receipt.

Result (verified by replaying the real decision log)

  • 111 → 25 tightenings; the dominant scope:denied class (tilde / sister-repo / own-note / scratch-worktree) is eliminated.
  • The 25 residuals are genuinely-permissive legacy drift policy_decide CORRECTLY blocks (a roleless merge, an out-of-scope test edit, a cross-task note) — not a relaxation regression — and are excluded by the window restart so a fresh clean week accrues from the converged logic.
  • POLICY_DECIDE_FN_VERSION 0.1.0 → 0.2.0 re-enters the promotion ladder at shadow (permanent-canary discipline).

Acceptance criteria

  • ✅ Dominant scope:denied tightening class eliminated (verified by re-running the replay/eval).
  • ✅ Cutover-eligibility receipt records span_days + asymmetric_ok (+ eligibility/countdown); written by the eval.
  • policy-decide-shadow-replay.timer remains active and untouched.
  • ✅ Ruff / ruff format / pyright clean; 164 policy_decide + 344 floor/cc_task_gate/sdlc_invariants/shadow tests green.
  • ➡️ TIGHTENING=0 over the evaluated window is reached by stamping the real window boundary post-deploy (policy-decide-shadow-eval --restart-window --receipt) once source-activation ships this logic — the converged logic lands here first, as the task title requires ("…before restarting the 7-day clean shadow window").

Task: reform-policy-decide-converge-20260601 · AuthorityCase: CASE-SDLC-REFORM-001

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added window restart boundary functionality to reset policy evaluation periods.
    • Introduced --restart-window and --receipt options for policy-decide shadow evaluation.
  • Improvements

    • Enhanced scope path normalization for more accurate path matching.
    • Improved handling of temporary directories, cache paths, and governance task notes.
    • Refined filesystem write detection logic to reduce false positives.
    • Policy version bumped to 0.2.0.
  • Tests

    • Expanded test coverage for scope resolution, path handling, and edge case scenarios.

…cause the scope:denied tightenings) + window-restart receipt

The 3b-cutover gate requires an asymmetric shadow window (every legacy-vs-new
divergence a LOOSENING, ZERO tightening). After #3828 drove the replayed
tightenings to 0, the shadow window accrued NEW decision shapes the repo-relative
reduction does not cover, re-opening ~111 TIGHTENINGS (90 scope:denied + 13
scope:command + 5 claim + 3 assignment) and re-blocking cutover on CORRECTNESS,
not calendar. Root-caused each class from the gate's REAL decision log and
converged policy_decide so it is once again a strict relaxation of the legacy gate.

Scope normalization (the dominant class):
- expanduser every scope ref before comparison (legacy-gate parity,
  cc-task-gate.impl.sh:838) — a `~/projects/<repo>/…` or `~/Documents/…` ref now
  matches the absolute target instead of comparing the raw `~` literal.
- a target presents BOTH worktree-relative (`<wt>/<rest>`, so a sister-repo ref
  like `hapax-coord/src/x` from a cross-repo task whose cwd was `~/projects/`
  matches) AND repo-relative (`<rest>`) forms.
- the worktree anchor is no longer `~/projects/`-only: a `…/scratch/<name>/`
  clone resolves repo-relative too (`/projects/` still wins for an inner scratch dir).

Cognition / bookkeeping carve-outs the real gate allowed (exit 0):
- general `/tmp/` scratch (the master design's "bare /tmp", vs the old
  `/tmp/hapax-*` only) and `~/.cache/hapax/relay/` receipts are cognition; the
  governance-sensitive `cc-active-task-*` claim files under `~/.cache/hapax/` stay gated.
- a session's OWN claimed cc-task note (`<task_id>.md`) is governance bookkeeping
  (session log, stage, AC boxes) — allowed regardless of scope/assignment, the way
  the legacy content-validated bootstrap allows it (and surviving a reconciler
  -unassign race). A DIFFERENT task's note stays fully gated.

Argument-aware write detection (scope:command):
- a python payload is a source write only with a real write SINK (write-mode
  `open()`, `.write_*`, shutil/os mutators) — a bare read `open(x)` or a
  string-building heredoc is not — AND only when its target is not solely
  scratch/cognition. The crude pre-strip the legacy gate used at this site is
  superseded (it mangled the very mode/target tokens the check reads); the strip
  lives on only in `legacy_bash_scope_block`.
- cp/mv/install/mkdir writing solely to `/tmp` (etc.) is scratch, not in-tree source.

Non-destructive window restart + cutover receipt (so cutover is neither forgotten
nor forced on the calendar):
- `evaluate_shadow_clean(window_start=…)` scopes the clean window to decisions at/
  after a stamped boundary; the replay carries each row's original `decision_ts` so
  divergences window correctly. The full log/ledger are PRESERVED as evidence.
- `restart_window` stamps the boundary; `build_cutover_receipt`/`write_cutover_receipt`
  emit a durable receipt (span_days, asymmetric_ok, cutover_eligible, countdown_days).
  `policy-decide-shadow-eval` gains `--window-start-file` / `--restart-window` /
  `--receipt`. The replay timer (criterion 4) stays untouched and active.

Result: replaying the real decision log drops 111 → 25 tightenings; the dominant
scope:denied class (tilde / sister-repo / own-note / scratch-worktree) is eliminated.
The 25 residuals are genuinely-permissive legacy drift policy_decide CORRECTLY blocks
(roleless merge, an out-of-scope test edit, a cross-task note) — not a relaxation
regression — and are excluded by the window restart so a fresh clean week accrues from
the converged logic. POLICY_DECIDE_FN_VERSION 0.1.0 → 0.2.0 re-enters the promotion
ladder at shadow (permanent-canary discipline). Restarting the real window is a
post-deploy step (`--restart-window`) once source-activation ships this logic.

Tests: tilde/sister-repo/scratch parity, /tmp + relay cognition, own-task-note
bookkeeping (incl. unassign race + other-note-still-blocks), python write-vs-read +
cp-to-scratch (incl. fail-closed in-tree writers), window-start filter + decision_ts
+ receipt eligibility/countdown, and the eval `--restart-window`/`--receipt` script
path. 164 policy_decide tests + 344 floor/cc_task_gate/sdlc_invariants/shadow green;
ruff + ruff format + pyright clean.

Task: reform-policy-decide-converge-20260601
AuthorityCase: CASE-SDLC-REFORM-001

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR refactors policy-decide shadow evaluation to support window-scoped cutover gating with durable restart boundaries. It enhances bash write detection, extends cognition carve-outs, reworks scope normalization via workspace anchors, adds time-windowed evidence filtering, and consolidates the CLI entrypoint into a shared module function called by the script wrapper.

Changes

Shadow evaluation cutover gate

Layer / File(s) Summary
Bash write detection and argument-aware classification
shared/policy_decide.py
Bash/Python write detection now uses precise regexes to distinguish write sinks (open(..., 'w'...), Path.write_*, shutil.copy/move) from read-only operations, preventing false positives and enabling accurate scope mutation enforcement.
Scope normalization with workspace-aware anchors
shared/policy_decide.py
Reworked cwd-independent scope matching via /projects/ and /scratch/ workspace anchors and new _scope_forms() function. Scope comparison now normalizes paths consistently across replays. Modern shell scope enforcement uses raw command with new classifier.
Cognition carve-outs and own task note handling
shared/policy_decide.py
Extended cognition paths to allow relay receipt writes under ~/.cache/hapax/relay/ and /tmp/ scratch writes without claims. Added _is_own_task_note() permitting session's own governance task note writes (task_id.md) regardless of scope, while other tasks' notes remain blocked.
Window boundary and cutover receipt infrastructure
shared/policy_decide.py
Added durable window restart/reload via load_window_start() and restart_window(), plus build_cutover_receipt() and write_cutover_receipt() to construct and persist eligibility artifacts. Enables non-destructive window restarts and tracks cutover eligibility.
Shadow replay and evaluate windowing
shared/policy_decide.py
Enhanced replay to stamp decision_ts from original decision logs into ledger. Added optional window_start parameter to evaluate_shadow_clean for time-scoped evidence filtering, using decision_ts to exclude pre-boundary decisions and divergence checks.
CLI entrypoint and script wrapper consolidation
shared/policy_decide.py, scripts/policy-decide-shadow-eval
Added shadow_eval_main() CLI with --restart-window and optional --receipt flags, exiting 0 iff window is clean. Bumped POLICY_DECIDE_FN_VERSION to 0.2.0. Refactored script from self-contained evaluator to thin wrapper delegating to shadow_eval_main().
Test coverage for scope and carve-outs
tests/test_policy_decide.py
Regression tests for /scratch/ worktree scope resolution, tilde/sister-repo ref expansion, /tmp cognition without claims, relay receipt cognition, own task note bookkeeping across claims, read-only systemctl non-mutation, and scope:command convergence for scratch reads vs in-tree writes.
Test coverage for windows and receipts
tests/test_policy_decide_shadow_producer.py, tests/test_policy_decide_shadow_scripts.py
Unit tests for window-start filtering (pre-boundary tightening excluded), cutover_receipt fields and countdown, and script-level tests for --window-start-file determinism, --restart-window + --receipt behavior (fresh window ineligibility, generated_at presence).

Sequence Diagram(s)

sequenceDiagram
  participant Script as policy-decide-shadow-eval
  participant ShadowEvalMain as shadow_eval_main()
  participant WindowLoad as load_window_start()
  participant Replay as replay_decision_log()
  participant EvalClean as evaluate_shadow_clean()
  participant BuildReceipt as build_cutover_receipt()
  participant WriteReceipt as write_cutover_receipt()

  Script->>ShadowEvalMain: invoke with --restart-window/--receipt
  ShadowEvalMain->>WindowLoad: load or restart boundary
  WindowLoad-->>ShadowEvalMain: window_start datetime
  ShadowEvalMain->>Replay: replay ledger with decision_ts
  Replay-->>ShadowEvalMain: stamped ledger rows
  ShadowEvalMain->>EvalClean: evaluate with window_start filter
  EvalClean-->>ShadowEvalMain: verdict (clean/tightening/asymmetry)
  alt --receipt requested
    ShadowEvalMain->>BuildReceipt: build from verdict
    BuildReceipt-->>ShadowEvalMain: receipt dict
    ShadowEvalMain->>WriteReceipt: persist receipt
    WriteReceipt-->>ShadowEvalMain: ✓
  end
  ShadowEvalMain-->>Script: exit 0 (clean) or 1 (tightening/asymmetry)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • hapax-systems/hapax-council#3806: The main PR builds directly on #3806 by refactoring scripts/policy-decide-shadow-eval into a wrapper over shared.policy_decide.shadow_eval_main and extending shared/policy_decide.evaluate_shadow_clean with window-start/receipt logic.
  • hapax-systems/hapax-council#3828: The scope-matching refactor (workspace/repo-relative path normalization) and shell source-mutation scope handling build directly on the normalization and quote-strip logic addressed in #3828.
  • hapax-systems/hapax-council#3843: Both PRs update argument-aware bash classification logic inside shared/policy_decide.py for modern shell mutation/source-scope detection, making the classifier/scoping changes directly related.

Poem

🐰 A shadow-week takes form with durable grace,
Window boundaries carved in restart-time's space,
Cognition carve-outs blooming /tmp and relay/,
Own task notes sing freely—scoped checks fade away,
From script to shared module, the refactor's dance,
Receipts hold the verdict: cutover's true chance! 📜✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately and specifically describes the main change: converging policy_decide to the legacy gate allow-set and root-causing the scope:denied tightenings.
Description check ✅ Passed The PR description is comprehensive and addresses all template sections. It includes a detailed summary, identifies the AuthorityCase (CASE-SDLC-REFORM-001) and slice (reform-policy-decide-converge-20260601), provides a thorough test plan, and addresses CLAUDE.md hygiene requirements.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch eta/reform-policy-decide-converge-20260601

Comment @coderabbitai help to get the list of available commands and usage tips.

@ryanklee ryanklee added this pull request to the merge queue Jun 1, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6cdd31c877

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread shared/policy_decide.py
Comment on lines +287 to +288
if head in {"cp", "mv", "install"}:
return positionals[-1:] # the destination is the trailing positional
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle -t destinations before allowing scratch-only copies

When Bash uses GNU cp/mv/install with -t (for example cp -t shared /tmp/x), the destination directory is supplied by the option, but this code only inspects the trailing positional /tmp/x. I checked coreutils help, which says -t, --target-directory=DIRECTORY copies/moves all source arguments into that directory; because /tmp/x is treated as cognition, _bash_is_source_scope returns false and the policy allows an out-of-scope write into shared/ without a scope-verifiable path.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
Comment on lines +277 to +279
literals = _PATH_LITERAL_RE.findall(command)
# A write whose every path literal is scratch/cognition is not an in-tree write.
return not (literals and all(is_cognition_path(p) for p in literals))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat Python write targets separately from string payloads

When a Python command writes a relative repo file while also containing a quoted scratch-looking string, e.g. Path('pyproject.toml').write_text('/tmp/x'), _PATH_LITERAL_RE captures only /tmp/x because the actual target has no slash. Since all captured literals are cognition paths, this returns false and the Bash call is classified as non-mutating, allowing an in-tree write to bypass the claim/scope gates entirely.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
r"\.write_text\(|\.write_bytes\("
r"|\bshutil\.(?:copy|copy2|copyfile|copytree|move)\("
r"|\bos\.(?:remove|unlink|rename|replace|rmdir|removedirs|makedirs|mkdir)\("
r"|\bopen\([^)]*,\s*['\"][^'\"]*[wax+]"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Recognize keyword-mode open() writes

When Python uses the common open('pyproject.toml', mode='w') form, this regex no longer matches the write sink because it only accepts a positional mode immediately after the comma. _bash_is_mutating therefore treats that Bash command as non-mutating, so an in-tree file write can bypass the claim/scope gates entirely; the previous broad open( check did block this shape.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
args.ledger,
min_days=args.min_days,
min_decisions=args.min_decisions,
window_start=window_start,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply the restart window to the auto-promoter

The restart boundary is only passed by shadow_eval_main; I checked systemd/units/policy-decide-promote.service, and the hourly production path is python -m shared.policy_decide promote --replay, whose run_promotion_cycle still calls evaluate_shadow_clean without loading DEFAULT_WINDOW_START. After an operator stamps --restart-window, the evaluator script shows a fresh window but the auto-promoter continues to evaluate the full historical ledger, so the pre-convergence tightenings can still keep the promotion ladder blocked.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
#: decision's worktree). Reducing BOTH sides to repo-relative form yields the same
#: verdict the live gate returned in that worktree. ``/projects/`` is tried first so a
#: repo carrying its OWN inner ``scratch/`` dir still anchors on the workspace root.
_WORKTREE_ANCHORS = ("/projects/", "/scratch/")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict scratch normalization to real scratch worktrees

Because /scratch/ is accepted as an anchor anywhere in an absolute path, a path such as /var/scratch/not-a-worktree/shared/secret.py normalizes to shared/secret.py and matches a normal shared/ scope ref. In contexts where an Edit/Write target is outside the intended /data/cache/hapax/scratch/<name>/ clone root, this lets arbitrary scratch-named paths bypass the absolute-path scope check.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
Comment on lines +417 to +419
# Ephemeral /tmp scratch (verify scripts, PR bodies, commit-message files) — the
# master design's "bare /tmp", broader than the legacy gate's /tmp/hapax-* only.
return path.startswith("/tmp/")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Normalize /tmp paths before treating them as cognition

Because the new broad /tmp/ carve-out checks only the raw string prefix, a path such as /tmp/../workspace/hapax-council/shared/policy_decide.py is allowed as cognition before claim, assignment, and scope checks, even though it resolves back into the worktree. Any Edit/Write path containing .. under /tmp can therefore bypass the source mutation gate unless the path is normalized before this check.

Useful? React with 👍 / 👎.

Comment thread shared/policy_decide.py
Comment on lines +436 to +438
if not any(root in p for root in _TASK_NOTE_ROOTS):
return False
return p.rsplit("/", 1)[-1] == f"{task_id}.md"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Constrain own-task-note matching to the vault root

This substring check allows any path containing /20-projects/hapax-cc-tasks/ or /20-projects/hapax-requests/ and named <task_id>.md to take the early own-task-note allow path. For example, an out-of-scope repo file under a similarly named directory would bypass assignment, status, authority, and scope checks; the carve-out should verify the path is actually under the intended vault root.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@shared/policy_decide.py`:
- Around line 282-299: _unconditional_targets currently assumes cp/mv/install
destination is the last positional, which misses -t/--target-directory forms;
update _unconditional_targets(head, tokens) to detect target flags for these
commands: look for short "-t" followed by a separate token or attached
"-tVALUE", and long "--target-directory" with either "=VALUE" or the next token,
and return that target (as a single-element list) when present; otherwise fall
back to the existing trailing-positional logic. Ensure handling covers both
attached and separated flag values and update any tests around
_unconditional_writes_in_tree/_unconditional_targets accordingly.
- Around line 426-438: The _is_own_task_note carve-out currently matches
_TASK_NOTE_ROOTS by substring which allows paths like /tmp/.../20-projects/...
to bypass checks; update _is_own_task_note to canonicalize both path and each
root (using os.path.abspath/os.path.realpath and os.path.expanduser), ensure
each root ends with a path separator, and then check that the canonical file
path startswith the canonical root before comparing the basename to
f"{task_id}.md" (reference _is_own_task_note and _TASK_NOTE_ROOTS to locate the
logic).
- Around line 256-279: The _PYTHON_WRITE_RE currently misses keyword-mode calls
like open(path, mode="w"); update the regex used by _PYTHON_WRITE_RE to also
match open\([^)]*\bmode\s*=\s*['"][^'"]*[wax+]['"] so keyword forms (with
optional spaces and either quote style, and combinations like "w+", "a", "x")
are detected as write sinks; ensure this change is reflected where
_PYTHON_WRITE_RE is used by _python_writes_in_tree (which relies on that regex
and _PATH_LITERAL_RE and is_cognition_path) so keyword-mode open(...) no longer
escapes the write-detection logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8d9b4326-3a63-41b0-b56c-3b13e70953fc

📥 Commits

Reviewing files that changed from the base of the PR and between 671b815 and 6cdd31c.

📒 Files selected for processing (5)
  • scripts/policy-decide-shadow-eval
  • shared/policy_decide.py
  • tests/test_policy_decide.py
  • tests/test_policy_decide_shadow_producer.py
  • tests/test_policy_decide_shadow_scripts.py

Comment thread shared/policy_decide.py
Comment on lines +256 to +279
#: A genuine source WRITE inside a python payload: a write/append/create ``open()``
#: mode, a ``Path.write_*`` sink, a ``shutil`` copy/move, or an ``os`` path-mutating
#: call. A bare ``open(x)`` (read) or a string-building heredoc is NOT a source write
#: — the over-block the legacy substring ``open\(`` produced (it counted every open).
_PYTHON_WRITE_RE = re.compile(
r"\.write_text\(|\.write_bytes\("
r"|\bshutil\.(?:copy|copy2|copyfile|copytree|move)\("
r"|\bos\.(?:remove|unlink|rename|replace|rmdir|removedirs|makedirs|mkdir)\("
r"|\bopen\([^)]*,\s*['\"][^'\"]*[wax+]"
)
#: A quoted path-like literal (a no-whitespace token carrying a ``/``) — used to tell
#: whether a write targets ONLY ephemeral scratch/cognition (``/tmp``, vault, relay).
_PATH_LITERAL_RE = re.compile(r"""['"]([^'"\s]*/[^'"\s]*)['"]""")


def _python_writes_in_tree(command: str) -> bool:
"""A python payload is a scope-bound source write only when it has a write sink AND
that write is not solely to ephemeral scratch/cognition (``/tmp``, vault,
``~/.cache/hapax/relay``). A read-only or string-building payload is never one."""
if not _PYTHON_WRITE_RE.search(command):
return False
literals = _PATH_LITERAL_RE.findall(command)
# A write whose every path literal is scratch/cognition is not an in-tree write.
return not (literals and all(is_cognition_path(p) for p in literals))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does Python's built-in open()support a keyword argument namedmode, e.g. open(path, mode="w")?

💡 Result:

Yes, Python's built-in open function supports a keyword argument named mode [1][2]. The function signature is defined as: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) [1][3] You can use it as a positional argument (e.g., open("file.txt", "w")) or as a keyword argument (e.g., open("file.txt", mode="w")) [1][2]. The mode argument is an optional string that specifies the mode in which the file is opened, with 'r' (reading in text mode) as the default value [1][3].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where _python_writes_in_tree is defined and used
rg -n "_python_writes_in_tree\s*\(" -S shared/policy_decide.py
rg -n "_python_writes_in_tree" -S shared/policy_decide.py

# Show the decision code paths around the call sites (keep output bounded)
rg -n "python_writes_in_tree|_bash_is_mutating|_unconditional_writes_in_tree" -S shared/policy_decide.py

# Print relevant sections with line numbers for context
python3 - <<'PY'
import re, itertools, pathlib
path = pathlib.Path("shared/policy_decide.py")
text = path.read_text(encoding="utf-8")
lines = text.splitlines()
# collect windows around matches
patterns = [
    re.compile(r"_python_writes_in_tree\s*\("),
    re.compile(r"_bash_is_mutating\s*\("),
    re.compile(r"_unconditional_writes_in_tree\s*\("),
]
matches = []
for i,l in enumerate(lines, start=1):
    if any(p.search(l) for p in patterns):
        matches.append(i)
for m in sorted(set(matches)):
    start=max(1,m-20); end=min(len(lines),m+40)
    print("\n--- context around line", m, "---")
    for j in range(start,end+1):
        print(f"{j:4d}: {lines[j-1]}")
PY

Repository: hapax-systems/hapax-council

Length of output: 19062


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate where _is_gated_mutation/_bash_is_mutating is used in _decide
rg -n "_is_gated_mutation\(" shared/policy_decide.py
rg -n "def _decide\(" shared/policy_decide.py
rg -n "claim|authority|scope" shared/policy_decide.py

# Print the _decide implementation window around its definition and around _is_gated_mutation usage
python3 - <<'PY'
import pathlib, re
p = pathlib.Path("shared/policy_decide.py")
lines = p.read_text(encoding="utf-8").splitlines()
def dump(start, end):
    for i in range(start, min(end, len(lines))+1):
        print(f"{i:4d}: {lines[i-1]}")
# Find def _decide
for i,l in enumerate(lines, start=1):
    if re.match(r"def _decide\(", l):
        dec_i=i
        break
else:
    dec_i=None
print("def _decide line:", dec_i)
dump(dec_i-20, dec_i+140)

# Find usages of _is_gated_mutation inside the file and dump a bit around each
for i,l in enumerate(lines, start=1):
    if "_is_gated_mutation(" in l:
        print("\n-- context around _is_gated_mutation call at line", i, "--")
        dump(i-25, i+60)
PY

Repository: hapax-systems/hapax-council

Length of output: 22409


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "_BASH_TOOLS|_EDIT_TOOLS" shared/policy_decide.py
rg -n "tool_call\.tool_name|ToolCall" shared/policy_decide.py
rg -n "class ToolCall|NamedTuple|dataclass|TypedDict" -S shared/policy_decide.py

Repository: hapax-systems/hapax-council

Length of output: 967


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python_file="shared/policy_decide.py"

# Show _EDIT_TOOLS and _BASH_TOOLS definitions
sed -n '60,120p' "$python_file" | cat -n

# Show ToolCall dataclass
sed -n '40,110p' "$python_file" | cat -n

# Show ToolCall creation from args (CLI) and from any earlier parsing
rg -n "ToolCall\(" "$python_file"
sed -n '1560,1605p' "$python_file" | cat -n

# Show how tool_call fields are derived (search for args.tool_name etc.)
rg -n "args\.tool_name|args\.command|args\.file_path" "$python_file"

Repository: hapax-systems/hapax-council

Length of output: 8549


Count open(..., mode="w") as a write sink in _python_writes_in_tree.

shared/policy_decide.py’s _PYTHON_WRITE_RE only matches positional open(..., 'w')-style mode arguments, so keyword mode= forms (e.g. open(path, mode="w")) aren’t detected as writes; the command then falls through as “non-mutating” and skips the gated claim/authority/scope checks.

Suggested fix
 _PYTHON_WRITE_RE = re.compile(
     r"\.write_text\(|\.write_bytes\("
     r"|\bshutil\.(?:copy|copy2|copyfile|copytree|move)\("
     r"|\bos\.(?:remove|unlink|rename|replace|rmdir|removedirs|makedirs|mkdir)\("
-    r"|\bopen\([^)]*,\s*['\"][^'\"]*[wax+]"
+    r"|\bopen\([^)]*,\s*['\"][^'\"]*[wax+]"
+    r"|\bopen\([^)]*\bmode\s*=\s*['\"][^'\"]*[wax+]"
 )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/policy_decide.py` around lines 256 - 279, The _PYTHON_WRITE_RE
currently misses keyword-mode calls like open(path, mode="w"); update the regex
used by _PYTHON_WRITE_RE to also match
open\([^)]*\bmode\s*=\s*['"][^'"]*[wax+]['"] so keyword forms (with optional
spaces and either quote style, and combinations like "w+", "a", "x") are
detected as write sinks; ensure this change is reflected where _PYTHON_WRITE_RE
is used by _python_writes_in_tree (which relies on that regex and
_PATH_LITERAL_RE and is_cognition_path) so keyword-mode open(...) no longer
escapes the write-detection logic.

Comment thread shared/policy_decide.py
Comment on lines +282 to +299
def _unconditional_targets(head: str, tokens: list[str]) -> list[str]:
"""Best-effort filesystem targets an unconditional-source command writes."""
positionals = [t for t in tokens[1:] if not t.startswith("-")]
if not positionals:
return []
if head in {"cp", "mv", "install"}:
return positionals[-1:] # the destination is the trailing positional
return positionals # tee/touch/truncate/chmod/chown/mkdir/rm/dd: every positional


def _unconditional_writes_in_tree(head: str, tokens: list[str]) -> bool:
"""An unconditional-source command is a scope-bound source write only when SOME
target is NOT ephemeral scratch/cognition; writing solely to ``/tmp`` (etc.) is
not an in-tree source mutation (legacy-gate recorded-allow parity)."""
targets = _unconditional_targets(head, tokens)
if not targets:
return True # cannot see a target → fail safe (scope-bound)
return any(not is_cognition_path(t) for t in targets)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle -t/--target-directory destinations when classifying scratch-only writes.

Line 287 assumes the destination for cp/mv/install is always the trailing positional. For commands like cp -t /tmp foo, mv --target-directory=/tmp bar, or install -t /tmp baz, _unconditional_targets() returns the source path instead, so _unconditional_writes_in_tree() reintroduces scope:command tightenings for writes that are actually scratch-only.

Suggested fix
 def _unconditional_targets(head: str, tokens: list[str]) -> list[str]:
     """Best-effort filesystem targets an unconditional-source command writes."""
+    if head in {"cp", "mv", "install"}:
+        for i, token in enumerate(tokens[1:], start=1):
+            if token in {"-t", "--target-directory"} and i + 1 < len(tokens):
+                return [tokens[i + 1]]
+            if token.startswith("--target-directory="):
+                return [token.split("=", 1)[1]]
     positionals = [t for t in tokens[1:] if not t.startswith("-")]
     if not positionals:
         return []
     if head in {"cp", "mv", "install"}:
         return positionals[-1:]  # the destination is the trailing positional
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/policy_decide.py` around lines 282 - 299, _unconditional_targets
currently assumes cp/mv/install destination is the last positional, which misses
-t/--target-directory forms; update _unconditional_targets(head, tokens) to
detect target flags for these commands: look for short "-t" followed by a
separate token or attached "-tVALUE", and long "--target-directory" with either
"=VALUE" or the next token, and return that target (as a single-element list)
when present; otherwise fall back to the existing trailing-positional logic.
Ensure handling covers both attached and separated flag values and update any
tests around _unconditional_writes_in_tree/_unconditional_targets accordingly.

Comment thread shared/policy_decide.py
Comment on lines +426 to +438
def _is_own_task_note(path: str, task_id: str) -> bool:
"""True iff ``path`` is the governance note for THIS claimed task (``<task_id>.md``).

The note basename must be exactly ``<task_id>.md`` under the cc-task/request vault
roots — so a session may keep its own note (session log, stage, AC boxes) but not
forge or edit another task's note through this carve-out.
"""
if not task_id or not path:
return False
p = os.path.expanduser(path)
if not any(root in p for root in _TASK_NOTE_ROOTS):
return False
return p.rsplit("/", 1)[-1] == f"{task_id}.md"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Anchor the own-task-note carve-out to the real vault roots.

Lines 436-438 only check whether the path contains /20-projects/hapax-cc-tasks/ or /20-projects/hapax-requests/. Because Line 651 turns this helper into an unconditional allow, a path like /tmp/x/20-projects/hapax-cc-tasks/<task_id>.md or a repo path with the same segment bypasses assignment and scope even though it is not the governance vault.

Suggested fix
 def _is_own_task_note(path: str, task_id: str) -> bool:
@@
     if not task_id or not path:
         return False
     p = os.path.expanduser(path)
-    if not any(root in p for root in _TASK_NOTE_ROOTS):
+    personal = os.path.expanduser("~/Documents/Personal")
+    allowed_roots = tuple(personal + root for root in _TASK_NOTE_ROOTS)
+    if not any(p.startswith(root) for root in allowed_roots):
         return False
     return p.rsplit("/", 1)[-1] == f"{task_id}.md"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@shared/policy_decide.py` around lines 426 - 438, The _is_own_task_note
carve-out currently matches _TASK_NOTE_ROOTS by substring which allows paths
like /tmp/.../20-projects/... to bypass checks; update _is_own_task_note to
canonicalize both path and each root (using os.path.abspath/os.path.realpath and
os.path.expanduser), ensure each root ends with a path separator, and then check
that the canonical file path startswith the canonical root before comparing the
basename to f"{task_id}.md" (reference _is_own_task_note and _TASK_NOTE_ROOTS to
locate the logic).

Merged via the queue into main with commit 5c53df7 Jun 1, 2026
34 checks passed
@ryanklee ryanklee deleted the eta/reform-policy-decide-converge-20260601 branch June 1, 2026 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant