fix(orchestrator): bound workflow lifecycle reason_code/refs and nesting depth#1144
Conversation
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Reviewing commit
3311444for PR #1144
Review record:
bd22a2a4-b4ad-4d5a-880f-de66735d598f
Blocking Findings
No in-scope blocking findings remained after policy filtering.
Non-blocking Suggestions
None.
Design Notes
Unable to assess architecture or changed behavior because the provided review artifacts and source snapshot were not readable in this execution environment.
Policy Notes
- Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.
Recovery Notes
First recoverable review artifact generated from codex analysis log.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
|
Thanks for the careful read — you were right that Follow-up commit 3b2788a:
Kept the other four bounds from this PR (workflow_id, reason_code, ref char length, nesting depth) since they are all reachable. No schema_version bump and no force push. |
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Branch: fix/wave2-h2-workflow-lifecycle-bounds | 2 files, +127/-6 | CI: Bridge TypeScript pass 14s https://github.com/Q00/ouroboros/actions/runs/26138958243/job/76880166589
Scope: diff-only
HEAD checked: 1d56bd5d16a105c386ced289a2e407095fac4ced
What Improved
- Added reachable validator bounds for
workflow_id,reason_code, per-ref string length, and lifecycledatanesting depth insrc/ouroboros/orchestrator/workflow_lifecycle.py. - Removed the earlier dead
MAX_WORKFLOW_LIFECYCLE_REFS_ENTRIES = 64path and now tests the operativeMAX_WORKFLOW_LIFECYCLE_REF_COUNT = 16ceiling. - Added focused negative tests for each new lifecycle hardening boundary.
Issue #956 Requirements
| Requirement | Status |
|---|---|
| Lifecycle payloads must be bounded and replay-safe. | MET for this PR scope — data bytes, ref count/bytes, workflow id length, reason code length, ref character length, and nesting depth are enforced at src/ouroboros/orchestrator/workflow_lifecycle.py:30 through src/ouroboros/orchestrator/workflow_lifecycle.py:37, src/ouroboros/orchestrator/workflow_lifecycle.py:425, src/ouroboros/orchestrator/workflow_lifecycle.py:441, and src/ouroboros/orchestrator/workflow_lifecycle.py:524. |
| Raw prompts, stdout, stderr, credentials, and free-form secrets do not belong in lifecycle events. | PRESERVED — replay-unsafe keys/refs remain rejected at src/ouroboros/orchestrator/workflow_lifecycle.py:49 and src/ouroboros/orchestrator/workflow_lifecycle.py:350; existing tests cover this at tests/unit/orchestrator/test_workflow_lifecycle_events.py:526. |
| Lifecycle persistence/replay should remain through EventStore without consumer contract churn. | PRESERVED — append_workflow_lifecycle_event and replay_workflow_lifecycle remain unchanged in behavior at src/ouroboros/persistence/event_store.py:1038 and src/ouroboros/persistence/event_store.py:1096. |
| No Microsoft Agent Framework / Azure / DurableTask dependency. | MET — changed files add no dependency surface; existing no-external-SDK direction is preserved. |
| Full #956 Workflow IR substrate acceptance criteria. | NOT CLAIMED BY THIS PR — #956 remains open; this PR is a narrow lifecycle hardening slice and does not break the existing Workflow IR boundary. |
Prior Findings Status
| Prior Finding | Status |
|---|---|
| Previous bot review reported no in-scope blockers. | MAINTAINED — current HEAD review found no verified blockers. |
Earlier review/comment thread flagged MAX_WORKFLOW_LIFECYCLE_REFS_ENTRIES = 64 as unreachable behind the existing 16-ref limit. |
WITHDRAWN — current HEAD removed that constant/guard; the operative limit remains enforced at src/ouroboros/orchestrator/workflow_lifecycle.py:445 and tested at tests/unit/orchestrator/test_workflow_lifecycle_events.py:691. |
Blockers
| # | File:Line | Severity | Confidence | Finding |
|---|
Follow-ups
| # | File:Line | Priority | Confidence | Suggestion |
|---|
Test Coverage
Focused lifecycle coverage is adequate for the changed logic. The new tests in tests/unit/orchestrator/test_workflow_lifecycle_events.py:669, tests/unit/orchestrator/test_workflow_lifecycle_events.py:681, tests/unit/orchestrator/test_workflow_lifecycle_events.py:691, tests/unit/orchestrator/test_workflow_lifecycle_events.py:714, and tests/unit/orchestrator/test_workflow_lifecycle_events.py:726 cover all newly added validators and the nesting-depth guard. Verified with PYTHONPATH=src uv run --no-project ... python -m pytest tests/unit/orchestrator/test_workflow_lifecycle_events.py -q: 36 passed.
Design / Roadmap Gate
Aligned. design_context.md:20 through design_context.md:22 frames the Workflow IR direction and no-MAF constraint, while design_context.md:153 through design_context.md:163 identifies this PR as a narrow bounded-validator hardening slice. Current HEAD evidence matches that: only lifecycle model/test files changed, with no new runtime dispatch, schema expansion, plugin surface, or external workflow dependency.
Merge Recommendation
- Merge-ready.
ouroboros-agent[bot]
PR #1144 merge-readiness recheckI rechecked this PR against the AgentOS SSOT direction from #961, the Wave 2 H2 scope in #1142, and the #956 Workflow IR v1 boundary. What this PR doesThis is a narrow H2 hardening slice for the durable workflow lifecycle event family:
AgentOS / SSOT alignmentI do not see this PR as directionally wrong or over-engineered:
Bot/design-note stateCurrent GitHub state on head
Validation performedLocal validation from a clean PR worktree using the repository venv: GitHub checks are also green: Ruff, MyPy, Python 3.12/3.13/3.14, Bridge TypeScript, enforce-envelope, and enforce-boundary. Merge recommendationRecommendation: merge-ready. The implementation is small, bounded, tested, and aligned with the AgentOS SSOT / #956 v1 boundary. I do not recommend another code change or empty trigger commit for this PR unless a future bot run produces a new substantive finding. |
pr-review analysis — PR #1144Verdict: APPROVE Review scopeReviewed the effective PR diff against base
Also cross-checked the PR against the #961 AgentOS SSOT, #1142 Wave 2 H2 intent, and the #956 Workflow IR v1 boundary. FindingsNo blocking findings. Detailed assessment
Validation evidenceFinal recommendationAPPROVE / merge-ready. No additional code changes are recommended for this PR at the current head. |
|
re review ping |
1 similar comment
|
re review ping |
|
@ouroboros-agent re review ping |
|
re review ping |
2 similar comments
|
re review ping |
|
re review ping |
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Automated queue-authoritative review run completed for this PR.
Summary
- Trigger: manual.requeue
- Queue reason: manual.requeue
- Review kind: full
- GitHub review event: COMMENT
- Merge eligible: false
Blockers
None identified in this automated pass.
Design / Roadmap Gate
Review scope uses affected-boundary analysis, not changed-lines-only. State machine, persistence, replay, and consumer contract surfaces remain in scope even when the exact lines are outside the diff.
Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: full
merge_eligible: false
head_sha: 1d56bd5
source_read_ok: true
diff_read_ok: true
blocking_count: 0
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1144 |
| HEAD checked | 1d56bd5d16a105c386ced289a2e407095fac4ced |
| Request ID | req_requeue_all_1779669182_1144 |
| Review record | 2735e3b0-641d-4d16-a2bf-c44cc9877444 |
What Improved
- Adds explicit lifecycle persistence-boundary limits for workflow_id, reason_code, ref string length, and nested data depth.
Issue Requirements
| Requirement | Status |
|---|---|
| Wave 2 #1142 / #956 lifecycle persistence hardening | Partially satisfied; bounds were added, but the refs validator now has an unbounded iterable consumption path before enforcing the ref-count ceiling. |
Prior Findings Status
Prior dead REFS_ENTRIES=64 concern appears addressed and is not re-raised. The current blocker is a new refs iterable materialization regression in the current head.
Blockers
| # | File:Line | Severity | Finding |
|---|---|---|---|
| 1 | src/ouroboros/orchestrator/workflow_lifecycle.py:560 | BLOCKING | _coerce_refs now does materialized = tuple(value) before the existing MAX_WORKFLOW_LIFECYCLE_REF_COUNT guard runs. Since refs accepts any Iterable, a very large or non-terminating generator is fully consumed before the validator rejects entry 17, turning the storage-DoS hardening into an unbounded memory/time path. The previous streaming _normalize_refs(value) path rejected after the limit. Keep the char-length check inside the bounded iteration path, and add a generator/large-iterable regression test. |
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| None. |
Non-blocking Suggestions
None.
Test Coverage Notes
- Reviewed
tests/unit/orchestrator/test_workflow_lifecycle_events.py; new tests cover tuple/list-style oversized refs but not large or non-terminating iterables. - Attempted to run the lifecycle tests, but this environment only has Python 3.11.2 and no pytest; the repo requires Python >=3.12 and hits syntax unsupported by 3.11.
Design Notes
The change is appropriately narrow and stays inside the lifecycle event persistence contract, but the refs validator should preserve streaming bounded validation.
Design / Roadmap Gate
Affected boundary is WorkflowLifecycleEvent construction before EventStore.append_workflow_lifecycle_event. This model is the durable lifecycle persistence gate, so validators must reject malformed or oversized inputs without first consuming unbounded caller-provided iterables.
Directional Notes
Maintainer memory pushed review focus toward storage-boundary integrity and avoiding duplicate/drifting validation paths. The blocker is based on current source evidence, not memory.
Test Coverage
- Reviewed
tests/unit/orchestrator/test_workflow_lifecycle_events.py; new tests cover tuple/list-style oversized refs but not large or non-terminating iterables. - Attempted to run the lifecycle tests, but this environment only has Python 3.11.2 and no pytest; the repo requires Python >=3.12 and hits syntax unsupported by 3.11.
Merge Recommendation
Do not merge until refs validation enforces the 16-entry ceiling while iterating, before full materialization, with coverage for generator or large-iterable input.
Review-Metadata:
verdict: REQUEST_CHANGES
head_sha: 1d56bd5
request_id: req_requeue_all_1779669182_1144
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Keep WorkflowLifecycleEvent refs validation streaming so oversized iterables are rejected at the configured count boundary without full materialization. Ref character-length validation now runs inside the same bounded normalization path.\n\nServices: shared\nAffected files:\n- src/ouroboros/orchestrator/workflow_lifecycle.py\n- tests/unit/orchestrator/test_workflow_lifecycle_events.py
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Metadata
| Field | Value |
|### Recovery Notes
First recoverable review artifact generated from codex analysis log.
---|---|
| PR | #1144 |
| HEAD checked | fc8ad7b63f615996cd4defbf8360015b8e6afa61 |
| Request ID | req_1779704190_13 |
| Review record | cab0ac3e-e7d2-45ff-9e2b-cd5bf41b36a1 |
What Improved
- Adds bounded validation for lifecycle
workflow_id,reason_code, individualrefs, and nesteddata. - Preserves the existing 16-ref ceiling and adds regression coverage for generator inputs so oversized iterables are rejected during bounded iteration.
Issue Requirements
| Requirement | Status |
|---|---|
| Add lifecycle persistence-boundary bounds for workflow id, reason code, refs, and nested data | Met |
| Preserve existing ref-count ceiling as the operative refs-entry limit | Met |
| Reject large/generator refs without full materialization | Met |
| Keep EventStore lifecycle persistence/replay contract unchanged | Met |
| Avoid broader runtime dispatch, schema, dependency, or AgentOS surface changes | Met |
Prior Findings Status
Prior dead MAX_WORKFLOW_LIFECYCLE_REFS_ENTRIES = 64 concern remains withdrawn: current HEAD uses the operative MAX_WORKFLOW_LIFECYCLE_REF_COUNT limit at src/ouroboros/orchestrator/workflow_lifecycle.py:445. The later unbounded refs materialization blocker is fixed in this snapshot: _coerce_refs now delegates directly to streaming _normalize_refs at src/ouroboros/orchestrator/workflow_lifecycle.py:559, and the generator regression test asserts only 17 yielded items at tests/unit/orchestrator/test_workflow_lifecycle_events.py:714.
Blockers
No in-scope blocking findings remained after policy filtering.
Follow-up Findings
| # | File:Line | Priority | Confidence | Suggestion |
|---|---|---|---|---|
| None. |
Non-blocking Suggestions
None.
Test Coverage Notes
- Reviewed
tests/unit/orchestrator/test_workflow_lifecycle_events.py; new tests cover oversizedreason_code, oversizedworkflow_id, ref count, large generator refs without full materialization, per-ref character limit, and nested data depth. - Attempted local execution, but this environment only has Python 3.11.2, no
pythonbinary, nopytest, and the project requires Python>=3.12.
Design Notes
The change is narrow and stays inside WorkflowLifecycleEvent validation plus targeted tests. Persistence still flows through WorkflowLifecycleEvent.to_base_event() and EventStore.append, with no new storage model or runtime dispatch behavior.
Design / Roadmap Gate
Affected boundary is lifecycle event construction before durable append/replay. The validators now bound caller-controlled strings, refs iterables, serialized refs/data size, and recursive data traversal before the event reaches EventStore.append_workflow_lifecycle_event. Compatibility is preserved for normal bounded lifecycle rows; malformed persisted rows remain rejected on rehydration through WorkflowLifecycleEvent.from_base_event.
Directional Notes
Maintainer memory focused the review on storage-boundary integrity and duplicate validation paths. The current assessment is based on source evidence in the snapshot, not memory alone.
Test Coverage
- Reviewed
tests/unit/orchestrator/test_workflow_lifecycle_events.py; new tests cover oversizedreason_code, oversizedworkflow_id, ref count, large generator refs without full materialization, per-ref character limit, and nested data depth. - Attempted local execution, but this environment only has Python 3.11.2, no
pythonbinary, nopytest, and the project requires Python>=3.12.
Merge Recommendation
Merge-ready. I found no current HEAD blocking issues in the changed boundary, and the prior materialization regression has explicit source and test coverage evidence.
Review-Metadata:
verdict: APPROVE
head_sha: fc8ad7b
request_id: req_1779704190_13
review_profile: memory-aware-zero-trust-v2
advisory_memory_only: true
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Summary
Wave 2 hardening for the #956 Workflow IR lifecycle event family. Addresses the security review of PR #1134, which flagged the lifecycle event model as accepting unbounded
reason_code,refs, andworkflow_id(storage DoS vector — HIGH), and_normalize_json_valueas recursing without a depth cap (RecursionError under extreme nesting — MEDIUM).Scope is intentionally narrow: validator-level bounds only, no schema bump, no blocklist additions, no changes to event consumers,
replay_workflow_lifecycle, orappend_workflow_lifecycle_event.Changes
Finalbounds nearMAX_WORKFLOW_LIFECYCLE_DATA_BYTES:MAX_WORKFLOW_LIFECYCLE_WORKFLOW_ID_LEN = 256MAX_WORKFLOW_LIFECYCLE_REASON_CODE_LEN = 1024MAX_WORKFLOW_LIFECYCLE_REFS_ENTRIES = 64MAX_WORKFLOW_LIFECYCLE_REF_LEN_CHARS = 512_MAX_NESTING_DEPTH = 32WorkflowLifecycleEvent._workflow_id_non_blank: rejects values longer than 256 chars after strip.WorkflowLifecycleEvent._optional_fields_non_blank: rejectsreason_codelonger than 1024 chars.WorkflowLifecycleEvent._coerce_refs: materializes the iterable once, rejects > 64 entries, and rejects any individualstrref over 512 chars at the validator entry point (defense-in-depth in front of the existing byte-count check in_normalize_refs)._normalize_json_valuenow accepts_depth: int = 0and raisesValueErrorwhen_depth > _MAX_NESTING_DEPTH. Recursive calls pass_depth=_depth + 1. This preventsRecursionErroron adversarial deeply-nesteddatapayloads.Test plan
python -m pytest tests/unit/orchestrator/test_workflow_lifecycle_events.py -q— 36 passed (31 existing + 5 new negative tests)python -m pytest tests/unit/orchestrator/ -q— 1651 passed, 1 skipped (no regressions)python -m pytest tests/conformance/workflow_ir/ -q— 43 passedreason_code, oversizedworkflow_id, > 64 refs entries, individual ref > 512 chars, and a 33-level nested dict rejected cleanly withValidationError(noRecursionError).Refs #1142, #956. Closes the #1134 security HIGH finding on unbounded fields and MEDIUM on nesting depth.