fix(qa): #1036 — structural_completeness WARN (not FATAL-cap) for authored campaigns #1037
…red campaigns (#1036)

ROOT CAUSE
The `structural_completeness` behavioral gate (qa/assert_behavioral.py) FATAL-capped AUTHORED golden-spine runs to 2.5. Sub-check (b) `unresolved_arc` fires when an active quest reaches session end open across a >=2-location arc with no quest-resolution call. But the campaign-arc quest is SEEDED from the authored adventure `hook` and is multi-session by design; the authored adventures (e.g. embergloom-pact) author NO closable sub-quests, so the DM legitimately never calls complete_quest / set_quest_status, and (b) FATAL-REDs even a clean 25-beat authored run. A self-inflicted false-cap. Sibling #1030 fixed party_traveled / combat_not_left_active the same way but missed this one.

FIX (Option A scope guard; mirrors #1030's WARN-vs-FATAL discipline EXACTLY)
- Compute `is_authored_campaign = bool(tools.get("start_adventure") or state.get("scenes"))`. `start_adventure` is the authored cold-open call (server.py:697), always in the tool stream `_tally` sees; `state["scenes"]` is non-empty only for seeded authored adventures (server.py serializes it; content.py persists authored scenes).
- Demote ONLY sub-check (b) unresolved_arc from FATAL->WARN when authored AND the only open quest is the hook-seeded arc. The gate still APPENDS the WARN message (visibility kept); the run is no longer RED-capped on (b) alone.
- Clause (a) approval-frozen stays FATAL ALWAYS.
- PRESERVE FATAL for: any NON-authored run (the original narrated-not-engaged failure), AND an authored run that called add_quest (server.py:10165, the DM's own quest-creation tool, distinguishable from the hook-seeded quest at gate time) and left it unresolved: a genuine dropped thread.

New severity: `_unresolved_fatal = unresolved_arc and (not is_authored_campaign or bool(tools.get("add_quest")))`; `_structural_fatal = approval_frozen_run or _unresolved_fatal`.

ANTI-SCORE-GAMING DUAL CORPUS PROOF
- NEW GREEN fixture qa/gate_corpus/cases/structural_completeness_authored_warn/ (built by builder.py `case_structural_completeness_authored_warn`, recorded under a new RED-only-safe `green_cases` manifest key): authored profile (start_adventure + scenes + frozen companion + active hook quest + 2 locations + no resolution) -> gate exits GREEN with structural_completeness as [WARN]. Locked by test_behavioral_gate_corpus.py::test_green_case_warns_but_stays_green (the inverse guard: re-promoting (b) to FATAL flips it RED and fails).
- The EXISTING non-authored qa/gate_corpus/cases/structural_completeness/ fixture (no start_adventure, no scenes) regenerated cleanly and STILL exits FATAL RED.
- 4 unit tests in qa/test_assert_behavioral.py: authored->WARN/GREEN, non-authored->FATAL/RED, authored+add_quest->FATAL (carve-out), authored-via-scenes->WARN.
- Coverage audit (test_manifest_covers_every_fatal_check) stays green: the gate still classifies structural_completeness as FATAL (fatal=<var>, not fatal=False), no drift.
- BEHAVIORAL_GATE_TAXONOMY.json hint updated to document the #1036 authored-WARN behavior.

Additive, no existing guard weakened. 78 focused tests pass single-process (no xdist).
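
The severity rule described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the code in qa/assert_behavioral.py: the function name `structural_severity`, its signature, and the `"FATAL"`/`"WARN"`/`"OK"` return strings are assumptions made for the example; only the two boolean expressions are taken from the commit message.

```python
from typing import Any, Mapping


def structural_severity(
    tools: Mapping[str, int],
    state: Mapping[str, Any],
    unresolved_arc: bool,
    approval_frozen_run: bool,
) -> str:
    """Hypothetical helper: classify the structural_completeness outcome."""
    # Authored signal: the cold-open tool call OR seeded scenes in state.
    is_authored_campaign = bool(tools.get("start_adventure") or state.get("scenes"))

    # Clause (b) stays FATAL for non-authored runs, and for authored runs
    # in which the DM created a quest of its own via add_quest.
    unresolved_fatal = unresolved_arc and (
        not is_authored_campaign or bool(tools.get("add_quest"))
    )

    # Clause (a) is FATAL regardless of the authored signal.
    structural_fatal = approval_frozen_run or unresolved_fatal

    if structural_fatal:
        return "FATAL"
    # Demoted case: the message is still surfaced, but the run is not capped.
    return "WARN" if unresolved_arc else "OK"


# The four unit-test scenarios listed above, expressed against the sketch.
authored = structural_severity({"start_adventure": 1}, {"scenes": ["s1"]}, True, False)
non_authored = structural_severity({}, {}, True, False)
carve_out = structural_severity({"start_adventure": 1, "add_quest": 1}, {}, True, False)
via_scenes = structural_severity({}, {"scenes": ["s1"]}, True, False)
```

Under these assumptions the authored and scenes-only cases classify as WARN, while the non-authored and `add_quest` cases remain FATAL, matching the four scenarios the commit message enumerates.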
No actionable comments were generated in the recent review.
Review info: review profile CHILL; plan Pro Plus; files selected for processing: 8.
Walkthrough: `structural_completeness` Authored-Campaign FATAL→WARN Guard
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 3 passed, 2 failed (1 warning, 1 inconclusive)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 50e4c6c681
# Severity: clause (a) approval-frozen stays FATAL ALWAYS. Clause (b) unresolved_arc is
# FATAL unless it's an authored campaign whose ONLY open quest is the hook-seeded arc.
_unresolved_fatal = unresolved_arc and (not is_authored_campaign or dm_added_quest)
Restrict authored demotion to the actual seeded quest
For resumed authored campaigns, `state["scenes"]` remains non-empty but the current transcript may not include the earlier `add_quest` call that created an active side quest. In that case `dm_added_quest` is false, so this line demotes `unresolved_arc` to a WARN for every active quest in the state, even when the open quest was a DM-created thread from a prior session rather than the hook-seeded campaign arc. That creates a false-green on the exact dropped-thread scenario the comment says should remain fatal; the demotion needs to verify the open quest is the single seeded hook (or otherwise track quest provenance), not just rely on this run's tool counts.
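
One way to act on this suggestion is to decide the demotion from the quests held in state rather than from the current run's tool counts. The sketch below is purely illustrative: it assumes each quest in `state["quests"]` is a dict with a `status` field and a `source` field (`"hook"` for the seeded arc, anything else for DM-created quests). Neither field nor the helper names are confirmed by this PR; the real state schema may record provenance differently or not at all.

```python
from typing import Any, Mapping


def only_hook_arc_open(state: Mapping[str, Any]) -> bool:
    """Hypothetical check: exactly one active quest, and it is the seeded hook."""
    open_quests = [
        q for q in state.get("quests", []) if q.get("status") == "active"
    ]
    # ASSUMPTION: quests carry a "source" provenance tag; not in the original code.
    return len(open_quests) == 1 and open_quests[0].get("source") == "hook"


def unresolved_is_fatal(
    unresolved_arc: bool, is_authored_campaign: bool, state: Mapping[str, Any]
) -> bool:
    """Demote only when authored AND the sole open quest is the hook arc."""
    demotable = is_authored_campaign and only_hook_arc_open(state)
    return unresolved_arc and not demotable


# Resumed session: a DM-created quest from an earlier session is still open,
# and the current transcript contains no add_quest call.
resumed_state = {
    "scenes": ["s1"],
    "quests": [
        {"name": "campaign arc", "status": "active", "source": "hook"},
        {"name": "side thread", "status": "active", "source": "dm"},
    ],
}
hook_only_state = {
    "scenes": ["s1"],
    "quests": [{"name": "campaign arc", "status": "active", "source": "hook"}],
}
resumed_fatal = unresolved_is_fatal(True, True, resumed_state)
hook_only_fatal = unresolved_is_fatal(True, True, hook_only_state)
```

Because the decision reads persisted quest state, the resumed-session case stays fatal even though `add_quest` never appears in the current tool stream, while the hook-only case is still demoted.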
Closes #1036. The `structural_completeness` gate FATAL-capped authored golden-spine runs to 2.5: the campaign-arc quest (seeded from the adventure `hook`) is multi-session by design and authored adventures author no closable sub-quests, so a 25-beat run still REDs. This is a self-inflicted false-cap, the one sibling #1030 didn't reach.

Option A scope guard (Option B "DM under-drives sub-quests" isn't viable until content authors per-act sub-quests): `is_authored_campaign = bool(tools.get("start_adventure") or state.get("scenes"))`; the `unresolved_arc` sub-check (b) demotes FATAL→WARN when authored and only the hook-seeded arc is open. Preserved FATAL: any non-authored run, AND an authored run with DM-added sub-quests (`add_quest`) left unresolved (genuine dropped threads). The approval-frozen sub-check (a) stays FATAL always.

Honesty proof (not score-gaming): new GREEN corpus fixture (authored → WARN/GREEN); the existing non-authored fixture stays FATAL RED; +4 unit tests (authored→WARN, non-authored→FATAL, add_quest→FATAL, scenes-signal→WARN) + a corpus green-case guard. 78 focused tests pass. Mirrors #1030's WARN-vs-FATAL discipline.
🤖 Generated with Claude Code