fix(chore-style): forbid create_issue in autofix mode; agent must open PR#39
fix(chore-style): forbid create_issue in autofix mode; agent must open PR#39norrietaylor wants to merge 1 commit into
Conversation
…n PR Closes #30. In autofix mode the Copilot agent was calling create_issue instead of create_pull_request, producing a duplicate of any open report-mode issue, never opening the autofix PR, and never closing the source issue. Reproduced on gominimal/spectacles-test run 26210555486 (chore-style-python autofix on an F403 violation). Root cause: the safe-outputs allowlist correctly includes both create-issue and create-pull-request so the chore can run in either mode, but the prose never tells the agent which one to call per mode. The agent fell back to create_issue. Fix: add a "Mode → safe-output contract (READ FIRST)" section to each chore-style-* workflow that names the required tool per mode and explicitly forbids the wrong tool, plus a matching "What you must not do" entry. The contract names both the YAML allowlist key (`create- pull-request`) and the runtime tool name (`create_pull_request`). Allowlist-level gating (templated `max:` expression that resolves to 0 in the wrong mode) was attempted but is not currently supportable in gh-aw: every ternary form (`cond && '0' || '1'`, `fromJSON('{"k":"v"}') [inputs.x]`) renders into the safe-outputs JSON env var with characters actionlint cannot lex (`&&` → `&&`; nested `\"` inside `fromJSON()`). The YAML frontmatter now carries a comment explaining this so a future contributor does not re-attempt the same approach. Applied identically to chore-style-{python,toml,rust,go,ncl} — they all share the report/autofix mode template. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughFive chore-style GitHub Actions workflows (Go, Nickel, Python, Rust, TOML) are updated to enforce mode-dependent safe-output tool selection. Each workflow adds explicit contracts and agent prompt rules preventing tools from being called incorrectly based on ChangesMode-dependent safe-output tool contract enforcement
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
workflows/chore-style-toml.md (1)
1-177: 💤 Low valueFile duplication: consider consolidation or automation.
This file appears to be a duplicate of
.github/workflows/chore-style-toml.md. Maintaining two identical source files increases the risk of drift if one is updated without the other. If both files serve different purposes in the workflow generation pipeline, consider either:
- Adding a comment explaining why duplication is necessary, or
- Automating synchronization (e.g., symlink, copy step in build), or
- Using one as the canonical source and generating the other
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@workflows/chore-style-toml.md` around lines 1 - 177, This workflow is duplicated—consolidate the duplicate copies of the "Style chore: TOML" workflow by choosing one canonical source (the existing chore-style-toml workflow that contains the Mode/safe-output contract and steps such as taplo fmt/ lint and create-pull-request/create-issue logic) and remove the redundant file OR implement an explicit synchronization strategy (a comment explaining why duplication is required, a repo-gen step, or a git-symlink/copy automation that keeps both files identical); ensure any references to the workflow (e.g., the Mode → safe-output contract, the taplo fmt/ taplo lint steps and the create-pull-request/create-issue safe-output rules) are updated to point to the canonical copy and add a brief header comment in the retained file documenting the canonical source and rationale for the chosen approach.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/chore-style-go.md:
- Around line 43-53: Add a runtime guard that validates inputs.mode against the
chosen side-effect tool before safe-output processing: inspect inputs.mode and
the parsed tool invocation (create_issue vs create_pull_request) and if they
mismatch, abort processing with a non-zero exit (or throw) so the workflow fails
closed; implement this check in the code path that handles safe-output
parsing/dispatch (the module/function that currently processes safe-output) and
ensure the error message clearly references inputs.mode and the invoked tool so
the mismatch is visible in logs.
In @.github/workflows/chore-style-ncl.md:
- Around line 42-52: Add a runtime validation gate in the workflow execution
path that rejects or errors when agent outputs violate the mode/tool contract:
when inputs.mode == "autofix" ensure no outputs named create_issue or
update_issue are passed to safe-output handling, and when inputs.mode ==
"report" ensure create_pull_request is not emitted; implement this check
immediately before safe-output serialization/allowlist processing so any
forbidden outputs cause the job to fail with a clear message referencing the
invalid symbol (create_issue, update_issue, create_pull_request) and include
inputs.mode in the error context.
---
Nitpick comments:
In `@workflows/chore-style-toml.md`:
- Around line 1-177: This workflow is duplicated—consolidate the duplicate
copies of the "Style chore: TOML" workflow by choosing one canonical source (the
existing chore-style-toml workflow that contains the Mode/safe-output contract
and steps such as taplo fmt/ lint and create-pull-request/create-issue logic)
and remove the redundant file OR implement an explicit synchronization strategy
(a comment explaining why duplication is required, a repo-gen step, or a
git-symlink/copy automation that keeps both files identical); ensure any
references to the workflow (e.g., the Mode → safe-output contract, the taplo
fmt/ taplo lint steps and the create-pull-request/create-issue safe-output
rules) are updated to point to the canonical copy and add a brief header comment
in the retained file documenting the canonical source and rationale for the
chosen approach.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0708b858-978d-428b-93e5-7e241ccc90a1
📒 Files selected for processing (15)
.github/workflows/chore-style-go.lock.yml.github/workflows/chore-style-go.md.github/workflows/chore-style-ncl.lock.yml.github/workflows/chore-style-ncl.md.github/workflows/chore-style-python.lock.yml.github/workflows/chore-style-python.md.github/workflows/chore-style-rust.lock.yml.github/workflows/chore-style-rust.md.github/workflows/chore-style-toml.lock.yml.github/workflows/chore-style-toml.mdworkflows/chore-style-go.mdworkflows/chore-style-ncl.mdworkflows/chore-style-python.mdworkflows/chore-style-rust.mdworkflows/chore-style-toml.md
| # Note (issue #30): both create-issue and create-pull-request are listed | ||
| # so the agent has the right tool available in either mode. The agent | ||
| # MUST select the tool that matches inputs.mode — see the prose contract | ||
| # below. Gating via templated `max` was attempted but rejected: every | ||
| # available gh-aw expression form for a ternary (`cond && '0' || '1'`, | ||
| # `fromJSON('{"k":"v"}')[inputs.x]`) renders into the safe-outputs JSON | ||
| # env var with characters that actionlint cannot lex (`&&` → `&&`; | ||
| # nested `\"` inside `fromJSON()`). The prose contract is the primary | ||
| # defense; if the agent calls create_issue in autofix anyway, the | ||
| # safe-output allowlist will still create the issue but the chore | ||
| # behaviour is wrong and visible from the run log. |
There was a problem hiding this comment.
Add runtime enforcement for mode/tool mismatch, not just prompt prose
This block confirms the system still permits create_issue side effects in autofix when the agent miscalls the tool. That means the #30 failure mode can recur under model drift. Please add an execution-time guard (before safe-output processing) that rejects mismatched tool types by inputs.mode so wrong outputs fail closed instead of being applied.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/chore-style-go.md around lines 43 - 53, Add a runtime
guard that validates inputs.mode against the chosen side-effect tool before
safe-output processing: inspect inputs.mode and the parsed tool invocation
(create_issue vs create_pull_request) and if they mismatch, abort processing
with a non-zero exit (or throw) so the workflow fails closed; implement this
check in the code path that handles safe-output parsing/dispatch (the
module/function that currently processes safe-output) and ensure the error
message clearly references inputs.mode and the invoked tool so the mismatch is
visible in logs.
There was a problem hiding this comment.
Acknowledged — the model-drift concern is real. Runtime enforcement was investigated in the original PR work and is blocked upstream in gh-aw today; the rationale is captured both in the Why no max: gating section of the PR body and in the inline YAML comment in safe-outputs: in every chore-style-* source file.
Short version: every supportable gh-aw expression form for a templated max (the natural gating point) renders into the safe-outputs JSON env var with characters that actionlint refuses (&& → &&; nested \" inside fromJSON() breaks the expression lexer). gh-aw's safe_outputs_config_generation.go uses plain json.Marshal (HTML-escaping on) instead of the project's own MarshalCompactNoHTMLEscape. Fixing this is a 1-line gh-aw upstream change and out of scope for this PR.
Tracking the upstream side and a complementary fixture-based CI defense in a new issue rather than blocking this PR. The current prose+forbidden-list contract is the strongest defense available without an upstream change; adding a runtime parse-and-fail step inside the chore lock would require either hand-editing the generated lock files (against repo conventions — see CLAUDE.md) or duplicating the chore logic into a wrapper-side validator, both of which trade more surface area for marginal extra safety.
There was a problem hiding this comment.
@norrietaylor, thanks for the thorough write-up — the constraint is clear and well-documented. The root cause (json.Marshal HTML-escaping in safe_outputs_config_generation.go producing && that actionlint can't lex) is a genuine upstream blocker, and the prose-contract approach is the right call given that hand-editing generated lock files is out per CLAUDE.md. I'll withdraw the original concern for this PR.
Would you like me to open the tracking issue for the two follow-up items you mentioned?
- Upstream gh-aw fix — switch
safe_outputs_config_generation.gofromjson.Marshalto the project'sMarshalCompactNoHTMLEscapesomax:can be gated with a templated expression. - Fixture-based CI defense — a workflow test that asserts
create_issueis never invoked wheninputs.mode == autofix(and vice versa), catching model-drift regressions without requiring the upstream change.
Just say the word and I'll create it and link it back to this PR and #30.
✏️ Learnings added
Learnt from: norrietaylor
Repo: norrietaylor/ch-oracles PR: 39
File: .github/workflows/chore-style-go.md:43-53
Timestamp: 2026-05-21T14:40:24.043Z
Learning: In the ch-oracles repo (norrietaylor/ch-oracles), runtime gating of safe-output tool calls by `inputs.mode` (e.g. blocking `create_issue` in autofix mode via a templated `max:` expression in gh-aw safe-outputs config) is currently blocked because gh-aw's `safe_outputs_config_generation.go` uses `json.Marshal` (HTML-escaping enabled) instead of the project's `MarshalCompactNoHTMLEscape`. This causes actionlint to reject the rendered env var due to `&&` (from `&&`) and nested escaped quotes. The fix is a 1-line upstream change to gh-aw; until then, the prose contract + explicit forbidden-list is the strongest available defense. Hand-editing generated lock files is prohibited per CLAUDE.md.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| # Note (issue #30): both create-issue and create-pull-request are listed | ||
| # so the agent has the right tool available in either mode. The agent | ||
| # MUST select the tool that matches inputs.mode — see the prose contract | ||
| # below. Gating via templated `max` was attempted but rejected: every | ||
| # available gh-aw expression form for a ternary (`cond && '0' || '1'`, | ||
| # `fromJSON('{"k":"v"}')[inputs.x]`) renders into the safe-outputs JSON | ||
| # env var with characters that actionlint cannot lex (`&&` → `&&`; | ||
| # nested `\"` inside `fromJSON()`). The prose contract is the primary | ||
| # defense; if the agent calls create_issue in autofix anyway, the | ||
| # safe-output allowlist will still create the issue but the chore | ||
| # behaviour is wrong and visible from the run log. |
There was a problem hiding this comment.
Enforce mode/tool contract in workflow execution path
This note documents that create_issue can still execute in autofix if the agent violates instructions. Relying only on prompt compliance leaves issue #30 as a latent regression risk. Add a runtime validation gate that blocks/errs on create_issue/update_issue outputs in autofix (and create_pull_request in report) before safe-output handling.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/chore-style-ncl.md around lines 42 - 52, Add a runtime
validation gate in the workflow execution path that rejects or errors when agent
outputs violate the mode/tool contract: when inputs.mode == "autofix" ensure no
outputs named create_issue or update_issue are passed to safe-output handling,
and when inputs.mode == "report" ensure create_pull_request is not emitted;
implement this check immediately before safe-output serialization/allowlist
processing so any forbidden outputs cause the job to fail with a clear message
referencing the invalid symbol (create_issue, update_issue, create_pull_request)
and include inputs.mode in the error context.
There was a problem hiding this comment.
Acknowledged — the model-drift concern is real. Runtime enforcement was investigated in the original PR work and is blocked upstream in gh-aw today; the rationale is captured both in the Why no max: gating section of the PR body and in the inline YAML comment in safe-outputs: in every chore-style-* source file.
Short version: every supportable gh-aw expression form for a templated max (the natural gating point) renders into the safe-outputs JSON env var with characters that actionlint refuses (&& → &&; nested \" inside fromJSON() breaks the expression lexer). gh-aw's safe_outputs_config_generation.go uses plain json.Marshal (HTML-escaping on) instead of the project's own MarshalCompactNoHTMLEscape. Fixing this is a 1-line gh-aw upstream change and out of scope for this PR.
Tracking the upstream side and a complementary fixture-based CI defense in a new issue rather than blocking this PR. The current prose+forbidden-list contract is the strongest defense available without an upstream change; adding a runtime parse-and-fail step inside the chore lock would require either hand-editing the generated lock files (against repo conventions — see CLAUDE.md) or duplicating the chore logic into a wrapper-side validator, both of which trade more surface area for marginal extra safety.
There was a problem hiding this comment.
@norrietaylor, understood — that's a thorough and technically sound explanation. The json.Marshal-with-HTML-escaping path in safe_outputs_config_generation.go is a legitimate upstream blocker, and the alternatives (hand-editing generated lock files, wrapper-side validator) both carry real costs that outweigh the marginal safety gain at this stage.
The prose contract + forbidden-list is the right call given the current gh-aw constraint. Happy to open the tracking issue for the upstream fix and fixture-based CI defense if you'd like — just say the word.
✏️ Learnings added
Learnt from: norrietaylor
Repo: norrietaylor/ch-oracles PR: 39
File: .github/workflows/chore-style-ncl.md:42-52
Timestamp: 2026-05-21T14:40:20.296Z
Learning: In the ch-oracles repo (norrietaylor/ch-oracles), runtime gating of safe-output tool selection by `inputs.mode` via templated `max:` expressions in gh-aw workflows is blocked upstream: `safe_outputs_config_generation.go` uses `json.Marshal` (HTML-escaping enabled) instead of `MarshalCompactNoHTMLEscape`, causing `&&` to render as `&&` and nested `\"` inside `fromJSON()` to break actionlint's expression lexer. The fix is a 1-line upstream gh-aw change. Until then, the prose/forbidden-list contract in the workflow prompt is the strongest available defense for enforcing the mode → safe-output tool mapping.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary
Closes #30. In autofix mode the Copilot agent was calling
create_issueinstead ofcreate_pull_request, producing a duplicate of any open report-mode issue, never opening the fix PR, and never closing the source issue (reproduced ongominimal/spectacles-testrun26210555486against an F403 violation).create-issueandcreate-pull-requestso the chore can run in either mode, but the prose never bound a specific tool to a specific mode. The agent fell back tocreate_issue.chore-style-*workflow now opens with a "Mode → safe-output contract (READ FIRST)" section that names the required tool per mode, explicitly forbids the wrong tool (including as a fallback after a failed verification gate), and is reinforced by entries in "What you must not do".chore-style-{python,toml,rust,go,ncl}— they all share the report/autofix mode template.Why no
max:gatingAllowlist-level gating (templated
max:expression that resolves to0for the wrong mode) was attempted but is not supportable in gh-aw today. Every available ternary form (cond && '0' || '1',fromJSON('{"k":"v"}')[inputs.x]) renders into the safe-outputs JSON env var with characters that actionlint cannot lex (&&becomes&&; nested\"insidefromJSON()breaks the expression lexer). The YAML frontmatter now carries an inline comment that explains this so a future contributor does not re-attempt the same path.Test plan
gh aw compile .github/workflows/chore-style-{python,toml,rust,go,ncl}.mdsucceeds (0 errors, 0 warnings).actionlint -colorpasses on every regenerated lock file.npx markdownlint-cli2 "workflows/chore-style-*.md"passes.for src in workflows/chore-style-*.md; do diff -q "$src" ".github/workflows/$(basename "$src")"; doneis clean (dogfood-md-sync invariant).scripts/test-safe-output-allowlists.pypasses (prose still namescreate-pull-requestso the allowlist entry is not flagged dead).scripts/test-chore-consistency.pypasses (14 chores consistent).scripts/audit-wrapper-permissions.py wrappers/*.ymlpasses (wrappers untouched).chore-style-pythoninmode=autofixongominimal/spectacles-testagainst the same F403 finding and confirm the agent now emitscreate_pull_request(notcreate_issue).🤖 Generated with Claude Code