feat(auto): safe-default closure mode + partial-unsafe blocker code (PR-B2) by shaun0927 · Pull Request #1167 · Q00/ouroboros

shaun0927 · 2026-05-22T01:49:14Z

Summary

PR-B2 of the L4 Auto Envelope v2 freeze (#1157, #821, #1138).

Closes two gaps observed in live ooo auto runs against the #821 acceptance matrix where the safe-default policy could have produced a clean closure but the result envelope did not reflect what happened:

The existing safe-default success path closed the interview but never set state.interview_closure_mode. Callers could not distinguish mutual_agreement (default None) from a safe-default-applied closure. PR-B2 tags the envelope with interview_closure_mode = "safe_default" on that path.
The partial safe-default case — some required gaps were safely defaultable but at least one remained unsafe (e.g. CONFLICTING ledger entry, per-section unsafe-context flag) — used to fall through to the generic interview_max_rounds_exhausted blocker with no structured event and no rollback. PR-B2 routes it to a dedicated event auto.interview.safe_default_partial_unsafe_gaps and a new typed code interview_unsafe_gaps_remain. Partial defaults are rolled back via the existing _revert_safe_default_entries helper because synthesis was never pushed to the backend transcript — same invariant as the synthesis-failure rollback already in place.

The genuine-deadlock case (nothing defaultable at all) is unchanged — it continues to emit interview_max_rounds_exhausted.

Why this scope

This is the smallest first slice that closes the documented PR-B2 gap from the #1157 living SSOT and pushes the L4 Envelope v2 lane from 🟡 partial to closer-to-🟢-complete. After this lands, closure_mode taxonomy has four values (None / ledger_only / safe_default; blockers do not carry it), and stop_reason_code taxonomy has eight codes (3 interview-layer + 5 Ralph-layer).

What is NOT done here

No schema change, no manifest change, no new envelope field. Pure behavior + envelope-tag refinement on fields shipped by fix(auto): close interview on ledger-only consensus at max_rounds #1148 (PR-B1, interview_closure_mode) and feat(auto): canonical stop_reason_code for interview-layer blockers #1151 (PR-E, stop_reason_code).
No change to finalize_safe_defaultable_gaps or the safe-default policy itself — only the driver-level routing of its outcome.
PR-C2 (assumptions[].source provenance promotion) remains a separate follow-up.

Scope

src/ouroboros/auto/interview_driver.py
- Tag safe_default closure_mode on the existing success branch.
- Insert the new partial-unsafe branch with rollback + typed code, ahead of the existing mutual_agreement_deadlock_at_max_rounds event emission.
skills/auto/SKILL.md
- Extend the canonical stop_reason_code taxonomy table to 8 codes (adds interview_unsafe_gaps_remain).
- Add an Interview closure mode taxonomy table covering None / ledger_only / safe_default.
tests/unit/auto/test_interview_pipeline.py
- Extend the existing safe-default success test to assert state.interview_closure_mode == "safe_default".
- Add a new test for the partial-unsafe path that constructs a CONFLICTING ledger entry on one section and asserts the new typed code, the rollback invariant, and the preserved user-recorded entry.
- Extend the existing unsafe-everything test (case C) with a regression guard that last_error_code stays interview_max_rounds_exhausted.

Test plan

uv run pytest tests/unit/auto/test_interview_pipeline.py -k "safe_default or unsafe_gaps or finalize" -q → 39 passed
uv run pytest tests/unit/auto tests/integration/auto -q → 908 passed (baseline 907 + 1 new)
uv run ruff check on touched files → clean
uv run ruff format --check on touched files → clean
uv run mypy src/ouroboros/auto/interview_driver.py → clean

Refs #1157 (L4 lane), #821 (autonomy acceptance matrix), #1138 (PR-A instrumentation), #1148 (PR-B1 ledger_only), #1151 (PR-E stop_reason_code).

PR-B2 of the L4 Auto Envelope v2 freeze (Q00#1157, Q00#821). ## Summary Close two gaps observed in live ``ooo auto`` runs against the Q00#821 acceptance matrix where the safe-default policy could have produced a clean closure but the result envelope did not reflect what happened: - The existing safe-default success path closed the interview via line ~470 of ``interview_driver.py`` but never set ``state.interview_closure_mode``. Callers could not distinguish ``mutual_agreement`` (default ``None``) from a safe-default-applied closure. PR-B2 tags the envelope with ``interview_closure_mode = "safe_default"`` on that path. - The partial safe-default case — some required gaps were safely defaultable but at least one remained unsafe (e.g. a CONFLICTING ledger entry, a per-section unsafe-context flag) — used to fall through to the generic ``interview_max_rounds_exhausted`` blocker with no structured event and no rollback. PR-B2 routes it to a dedicated event ``auto.interview.safe_default_partial_unsafe_gaps`` and a new typed code ``interview_unsafe_gaps_remain``; the partial defaults are rolled back via the existing ``_revert_safe_default_entries`` helper because synthesis was never pushed to the backend transcript (same invariant as the synthesis-failure rollback already in place). The genuine-deadlock case (nothing defaultable at all) is preserved unchanged — it continues to emit ``interview_max_rounds_exhausted``. ## Scope - ``src/ouroboros/auto/interview_driver.py``: tag ``safe_default`` closure_mode on the existing success branch; insert the new partial-unsafe branch before the existing ``ledger_done`` event. - ``skills/auto/SKILL.md``: extend the stop_reason_code taxonomy (8 codes now) and add an Interview closure mode taxonomy table. - ``tests/unit/auto/test_interview_pipeline.py``: - Extend the existing safe-default success test to assert ``state.interview_closure_mode == "safe_default"``. - Add a new test for the partial-unsafe path that constructs a CONFLICTING ledger entry on one section and asserts the new typed code, the rollback invariant, and the preserved user-recorded entry. - Extend the existing unsafe-everything test (case C) with a regression guard that ``last_error_code`` stays ``interview_max_rounds_exhausted``. No schema change, no manifest change, no new envelope field. Pure behavior + envelope-tag refinement on existing fields shipped by Q00#1148 and Q00#1151. ## Test plan - ``uv run pytest tests/unit/auto/test_interview_pipeline.py -k "safe_default or unsafe_gaps or finalize" -q`` → 39 passed - ``uv run pytest tests/unit/auto tests/integration/auto -q`` → 908 passed (baseline 907 + 1 new) - ``uv run ruff check`` on touched files → clean - ``uv run ruff format --check`` on touched files → clean - ``uv run mypy src/ouroboros/auto/interview_driver.py`` → clean Refs Q00#1157 (L4 lane), Q00#821 (autonomy acceptance matrix), Q00#1138 (PR-A instrumentation), Q00#1148 (PR-B1 ledger_only), Q00#1151 (PR-E stop_reason_code). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit c2f34e8 for PR #1167

Review record: 30bb7c29-1ab2-4b29-9013-abbefa631dfa

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to complete the review: every attempt to read the provided diff, changed-file list, comments, or source files failed before execution because the sandbox wrapper cannot create a user namespace (bwrap: No permissions to create a new namespace). I did not inspect the PR contents, so I cannot provide a valid architectural assessment.

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Q00 · 2026-05-25T09:27:30Z

Follow-up policy note from #1219 / PR #1220: if the safe-default synthesis is successfully written to the persisted interview transcript but the backend returns a non-terminal follow-up turn, the auto driver should now fail forward as a safe_default closure. The ledger is already structurally complete and mirrored into the transcript, so rolling defaults back creates the observed cli-todo false block. The remaining fail-closed path is transcript-sync failure before the synthesis is persisted.

Posted by agentos-roadmap-warden — bot. Reply with /warden ignore to suppress further comments on this thread.

ouroboros-agent Bot approved these changes May 22, 2026

View reviewed changes

shaun0927 merged commit d9ea9d1 into Q00:main May 22, 2026
10 of 11 checks passed

shaun0927 mentioned this pull request May 22, 2026

feat(auto): additive assumption_sources provenance surface (PR-C2) #1169

Merged

5 tasks

Q00 mentioned this pull request May 22, 2026

Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(auto): safe-default closure mode + partial-unsafe blocker code (PR-B2)#1167

feat(auto): safe-default closure mode + partial-unsafe blocker code (PR-B2)#1167
shaun0927 merged 1 commit into
Q00:mainfrom
shaun0927:feat/auto-interview-safe-default-closure

shaun0927 commented May 22, 2026

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

Uh oh!

Q00 commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

shaun0927 commented May 22, 2026

Summary

Why this scope

What is NOT done here

Scope

Test plan

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Blocking Findings

Non-blocking Suggestions

Design Notes

Recovery Notes

Uh oh!

Uh oh!

Q00 commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants