
refactor(execute): execution-unit cleanup + release-readiness fixes #106

Merged
lukas-grigis merged 3 commits into main from refactor/execution-unit-cleanup
May 6, 2026

Conversation

@lukas-grigis
Owner

Summary

Three commits on the back of the v0.6.2 setup/check-split branch:

  • refactor(execute): slim per-task execution unit folder. Drops per-unit tasks.json and prior-evaluations/; the canonical sprint-root copy + the inline ### Evaluator output block in tasks.md cover both.
  • docs: replace stale evaluations/<taskId>.md path with per-round truth. CLAUDE.md / ARCHITECTURE.md / REQUIREMENTS.md now point at execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md (the actual file EvaluateAndFixLoopUseCase writes) instead of the legacy sidecar path.
  • fix(execute): release-readiness fixes for setup/check split. Six audit findings against 0b6eece4..HEAD so v0.6.2 users can upgrade without losing sprints and the chain trace + adapter contracts match the tests that fence them.

Release-readiness fixes (last commit)

  • Legacy sprint.json files load. Zod schema defaults setupRanAt to {}; v0.6.2 files carrying checkRanAt (the legacy key) parse successfully and self-clean on the next save. Two new sprint-schema tests cover the round-trip and the no-audit-key-at-all variant.
  • Spawn-level setup failures hard-abort. setup-scripts-sprint-start now wraps the runSetupScript call in try/catch; ENOENT, EPERM, missing binary all surface as InvalidStateError({ currentState: 'setup-failed' }) naming the failing repo, identical to a non-zero exit. The soft OnError fallback (setup-scripts-sprint-start-noop) is gone: a broken baseline must fail loudly.
  • Sprint resume skips already-stamped repos. The leaf no-ops per repo when Sprint.setupRanAt[repoPath] already carries a timestamp; the existing stamp is preserved.
  • Identity-leaf helper consolidated. Single shared noopLeaf<TCtx>(name) at src/application/chains/leaves/noop-leaf.ts; the duplicate noopLeafExec / noopLeaf definitions in execute-flow.ts and per-task-flow.ts are removed.
  • Tester agent memory line 232 references setupRanAt (post-rename field).
  • Docs synced: CHANGELOG Unreleased, CLAUDE.md, ARCHITECTURE.md, REQUIREMENTS.md.
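
For readers who want the legacy-load behaviour in concrete terms, here is a minimal sketch. It is a hand-rolled stand-in, not the project's Zod schema: `parseSprint`, the reduced `Sprint` shape, and the `id` field are assumptions made for illustration. It only shows the two properties described above: `setupRanAt` defaults to `{}` when absent, and an unknown key such as `checkRanAt` is dropped, so it disappears on the next save.

```typescript
// Hypothetical, reduced shape; the real Sprint type carries more fields.
interface Sprint {
  id: string;
  setupRanAt: Record<string, string>; // repoPath -> ISO timestamp
}

// Stand-in for the schema parse step: default setupRanAt to {} and
// ignore unknown keys (such as the legacy checkRanAt).
function parseSprint(raw: unknown): Sprint {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("sprint.json must contain an object");
  }
  const input = raw as Record<string, unknown>;
  if (typeof input.id !== "string") {
    throw new Error("sprint.json is missing a string id");
  }
  const setupRanAt: Record<string, string> = {};
  const stamps = input.setupRanAt;
  if (typeof stamps === "object" && stamps !== null) {
    for (const [repo, ts] of Object.entries(stamps as Record<string, unknown>)) {
      if (typeof ts === "string") setupRanAt[repo] = ts;
    }
  }
  // Only known keys are copied into the result, so checkRanAt is not carried over.
  return { id: input.id, setupRanAt };
}

// A v0.6.2-style file: legacy key present, no setupRanAt.
const legacy = parseSprint({ id: "s1", checkRanAt: {} });

// Serialising the parsed value is what "self-cleans on the next save" means here.
const saved = JSON.stringify(legacy);
```

In this sketch the round-trip needs no migration step, because the legacy key never makes it into the parsed object.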

Pipeline (post-merge)

sprint start
  resolve-branch
  dirty-tree-preflight
  resolve-check-scripts
  setup-scripts-sprint-start    ← now hard-aborts on ANY setup failure
                                   (red exit OR spawn-level error);
                                   skips repos already stamped on
                                   sprint.setupRanAt (resume case)
loop per task:
  branch-preflight
  execute-task
  post-task-check
  evaluate-task
  mark-done
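
The setup-scripts-sprint-start behaviour annotated above (hard abort on a red exit or a spawn-level error, skip on an existing stamp) could look roughly like the sketch below. All signatures are assumptions for illustration: the real `runSetupScript`, `InvalidStateError`, and leaf context are not shown in this PR description, and `runSetupAtSprintStart`, `SetupResult`, and the `repoPath` detail field are hypothetical names.

```typescript
// Illustrative error type; only currentState: 'setup-failed' comes from the PR text.
class InvalidStateError extends Error {
  details: { currentState: string; repoPath: string };
  constructor(details: { currentState: string; repoPath: string }) {
    super(`setup failed for ${details.repoPath}`);
    this.name = "InvalidStateError";
    this.details = details;
  }
}

interface SetupResult {
  exitCode: number;
}
type RunSetupScript = (repoPath: string) => Promise<SetupResult>;

async function runSetupAtSprintStart(
  repoPaths: string[],
  setupRanAt: Record<string, string>,
  runSetupScript: RunSetupScript,
  now: () => string = () => new Date().toISOString(),
): Promise<Record<string, string>> {
  const stamps = { ...setupRanAt };
  for (const repoPath of repoPaths) {
    // Resume case: an existing stamp is preserved and the script is not re-run.
    if (stamps[repoPath] !== undefined) continue;

    let result: SetupResult;
    try {
      result = await runSetupScript(repoPath);
    } catch {
      // Spawn-level failure (ENOENT, EPERM, ...) is treated like a red exit.
      throw new InvalidStateError({ currentState: "setup-failed", repoPath });
    }
    if (result.exitCode !== 0) {
      throw new InvalidStateError({ currentState: "setup-failed", repoPath });
    }
    stamps[repoPath] = now();
  }
  return stamps;
}
```

Both failure paths throw the same error in this sketch, so there is no soft fallback left for a chain to route to.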

Test plan

  • pnpm typecheck clean
  • pnpm lint clean
  • pnpm test: 233 files, 2262 tests pass (4 new: legacy-checkRanAt round-trip, no-audit-key fallback, hard-aborts on spawn error, skip-if-stamped + mixed-stamps)
  • grep -rn "function noopLeaf" src/ returns exactly one match (the new shared module)
  • grep -rn "setup-scripts-sprint-start-noop" src/ returns only the negative-assertion lines in the new test
  • Manual smoke: RALPHCTL_ROOT=/tmp/legacy ralphctl sprint show against a hand-crafted v0.6.2 sprint.json with checkRanAt: {} should load without error
  • Manual smoke: kill sprint start mid-task, re-run; setup script does NOT re-execute on the stamped repo
  • Manual smoke: point setupScript at a non-existent binary; chain hard-aborts naming the failing repo

- drop per-unit tasks.json (canonical sprint-root copy is enough)
- drop prior-evaluations/; sibling evaluator output now renders inline
  inside tasks.md under a `### Evaluator output` block per task
- update evaluator workspace breadcrumbs, port docs, and ARCHITECTURE.md
  storage tree to match

- CLAUDE.md § Evaluator Pattern: full critique now points at
  execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md;
  Task.evaluationFile + Task.evaluationOutput noted.
- ARCHITECTURE.md § session.md audit: standalone sprint evaluate no longer
  writes to evaluations/session-<task-id>.md; it writes into
  execution/<unit-slug>/rounds/standalone-<ISO>/evaluator/.
- ARCHITECTURE.md § Harness Signals EvaluationSignal row: drop sidecar
  claim; note tasks.json preview + EvaluateAndFixLoopUseCase per-round
  write path.
- REQUIREMENTS.md § Harness-owned output writes + § Evaluator Pattern:
  replace evaluations/<taskId>.md with per-round path in both.

Bundle of audit fixes against 0b6eece..HEAD so v0.6.2 users can upgrade
without losing sprints and the chain trace + adapter contracts match the
tests that fence them.

- Legacy `sprint.json` files load. The Zod schema now defaults
  `setupRanAt` to `{}`, so files written by v0.6.2 (carrying `checkRanAt`,
  no `setupRanAt`) parse successfully; the obsolete key is silently
  stripped on the next save and the file self-cleans without a migration.
  Two new sprint-schema tests cover the legacy round-trip and the
  no-audit-key-at-all variant.
- Spawn-level setup failures hard-abort. The `setup-scripts-sprint-start`
  leaf now wraps the `runSetupScript` call in try/catch; ENOENT, EPERM,
  and friends surface as `InvalidStateError({ currentState: 'setup-failed' })`
  naming the failing repo, identical to a non-zero exit. The soft `OnError`
  fallback (`setup-scripts-sprint-start-noop`) is gone: a broken baseline
  must fail loudly, not paper over a half-installed dependency tree. The
  `soft-degrades` test flips to `hard-aborts`; the trace fence asserts
  no `-noop` / `-degraded` step appears anywhere.
- Sprint resume skips already-stamped repos. The leaf no-ops per repo when
  `Sprint.setupRanAt[repoPath]` already carries a timestamp; the existing
  stamp is preserved (no churn). New tests cover the pure-resume case and
  the mixed (one stamped, one fresh) case.
- Identity-leaf helper consolidated. Single shared `noopLeaf<TCtx>(name)`
  at `src/application/chains/leaves/noop-leaf.ts`; the duplicate
  `noopLeafExec` / `noopLeaf` definitions in `execute-flow.ts` and
  `per-task-flow.ts` are removed.
- Tester agent memory line 232 now references `setupRanAt` (the
  post-rename field).
- Docs synced. CHANGELOG Unreleased rewrites the spawn-error bullet for
  the setup gate and adds resume-skip + legacy-load entries; CLAUDE.md
  and ARCHITECTURE.md tighten the "non-zero exit hard-aborts" wording to
  "any setup failure (non-zero exit or spawn-level error)".

Verification: pnpm typecheck && pnpm lint && pnpm test: 233 files,
2262 tests pass.
@lukas-grigis lukas-grigis merged commit adee553 into main May 6, 2026
1 check passed
@lukas-grigis lukas-grigis deleted the refactor/execution-unit-cleanup branch May 6, 2026 09:52