refactor(execute): execution-unit cleanup + release-readiness fixes by lukas-grigis · Pull Request #106 · lukas-grigis/ralphctl

lukas-grigis · 2026-05-06T08:39:53Z

Summary

Three commits on the back of the v0.6.2 setup/check-split branch:

- `refactor(execute): slim per-task execution unit folder`. Drops per-unit `tasks.json` and `prior-evaluations/`; the canonical sprint-root copy plus the inline `### Evaluator output` block in `tasks.md` cover both.
- `docs: replace stale evaluations/<taskId>.md path with per-round truth`. CLAUDE.md, ARCHITECTURE.md, and REQUIREMENTS.md now point at `execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md` (the actual file `EvaluateAndFixLoopUseCase` writes) instead of the legacy sidecar path.
- `fix(execute): release-readiness fixes for setup/check split`. Six audit findings against `0b6eece4..HEAD`, fixed so that v0.6.2 users can upgrade without losing sprints and the chain trace + adapter contracts match the tests that fence them.

Release-readiness fixes (last commit)

- Legacy `sprint.json` files load. The Zod schema defaults `setupRanAt` to `{}`; v0.6.2 files carrying `checkRanAt` (the legacy key) parse successfully and self-clean on the next save. Two new sprint-schema tests cover the round-trip and the no-audit-key-at-all variant.
- Spawn-level setup failures hard-abort. `setup-scripts-sprint-start` now wraps the `runSetupScript` call in try/catch; ENOENT, EPERM, and a missing binary all surface as `InvalidStateError({ currentState: 'setup-failed' })` naming the failing repo, identical to a non-zero exit. The soft `OnError` fallback (`setup-scripts-sprint-start-noop`) is gone: a broken baseline must fail loudly.
- Sprint resume skips already-stamped repos. The leaf no-ops per repo when `Sprint.setupRanAt[repoPath]` already carries a timestamp; the existing stamp is preserved.
- Identity-leaf helper consolidated. Single shared `noopLeaf<TCtx>(name)` at `src/application/chains/leaves/noop-leaf.ts`; the duplicate `noopLeafExec` / `noopLeaf` definitions in `execute-flow.ts` and `per-task-flow.ts` are removed.
- Tester agent memory line 232 references `setupRanAt` (post-rename field).
- Docs synced: CHANGELOG Unreleased, CLAUDE.md, ARCHITECTURE.md, REQUIREMENTS.md.
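
A minimal sketch of the legacy-load behavior described above, written without Zod so it runs on its own. `SprintFile`, `parseSprint`, and the `id` field are hypothetical stand-ins, not the project's actual schema; in Zod the same effect comes from a `.default({})` on the `setupRanAt` field together with `z.object`'s default of stripping unrecognized keys.

```typescript
// Hypothetical, dependency-free stand-in for the real Zod sprint schema.
type SetupRanAt = Record<string, string>; // repoPath -> ISO timestamp

interface SprintFile {
  id: string; // assumed field, present only to make the example concrete
  setupRanAt: SetupRanAt;
}

function parseSprint(raw: unknown): SprintFile {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("sprint.json must contain an object");
  }
  const input = raw as Record<string, unknown>;
  if (typeof input.id !== "string") {
    throw new Error("sprint.json is missing a string id");
  }

  // Default to {} when the key is absent (the v0.6.2 case).
  const setupRanAt: SetupRanAt = {};
  const stamps = input.setupRanAt;
  if (typeof stamps === "object" && stamps !== null) {
    const record = stamps as Record<string, unknown>;
    for (const repoPath of Object.keys(record)) {
      const ranAt = record[repoPath];
      if (typeof ranAt === "string") setupRanAt[repoPath] = ranAt;
    }
  }

  // Only known keys are copied, so a legacy checkRanAt never reaches the output.
  return { id: input.id, setupRanAt };
}

// A v0.6.2-style file: checkRanAt present, setupRanAt absent.
const legacy = parseSprint(JSON.parse('{"id":"s1","checkRanAt":{}}'));
const reSaved = JSON.stringify(legacy);
```

Because the parser copies only known keys, re-saving the parsed value is what drops `checkRanAt`; no separate migration step is involved.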

Pipeline (post-merge)

sprint start
  resolve-branch
  dirty-tree-preflight
  resolve-check-scripts
  setup-scripts-sprint-start    ← now hard-aborts on ANY setup failure
                                   (red exit OR spawn-level error);
                                   skips repos already stamped on
                                   sprint.setupRanAt (resume case)
loop per task:
  branch-preflight
  execute-task
  post-task-check
  evaluate-task
  mark-done
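
The `setup-scripts-sprint-start` step above can be sketched as follows. This is an illustration under stated assumptions, not the repository's leaf: the `Sprint` shape, the `SetupResult` type, the injected `runSetupScript` and `now` parameters, and the `message` field on `InvalidStateError` are all hypothetical, and the runner is synchronous here for brevity where the real one may well be asynchronous.

```typescript
// Hypothetical error class; only the currentState value comes from the PR text.
class InvalidStateError extends Error {
  readonly currentState: string;
  constructor(opts: { currentState: string; message: string }) {
    super(opts.message);
    this.name = "InvalidStateError";
    this.currentState = opts.currentState;
  }
}

interface Sprint {
  setupRanAt: Record<string, string>; // repoPath -> ISO timestamp
}
type SetupResult = { exitCode: number };
// May throw for spawn-level failures such as ENOENT or EPERM.
type RunSetupScript = (repoPath: string) => SetupResult;

function setupScriptsSprintStart(
  sprint: Sprint,
  repoPaths: string[],
  runSetupScript: RunSetupScript,
  now: () => string = () => new Date().toISOString(),
): Sprint {
  const setupRanAt = { ...sprint.setupRanAt };
  for (const repoPath of repoPaths) {
    // Resume case: an existing stamp is kept untouched and the script is not re-run.
    if (setupRanAt[repoPath] !== undefined) continue;

    let result: SetupResult;
    try {
      result = runSetupScript(repoPath);
    } catch (cause) {
      // Spawn-level error: same hard abort as a red exit, naming the repo.
      throw new InvalidStateError({
        currentState: "setup-failed",
        message: `setup script could not start in ${repoPath}: ${String(cause)}`,
      });
    }
    if (result.exitCode !== 0) {
      throw new InvalidStateError({
        currentState: "setup-failed",
        message: `setup script exited with ${result.exitCode} in ${repoPath}`,
      });
    }
    setupRanAt[repoPath] = now();
  }
  return { ...sprint, setupRanAt };
}

// Mixed case: /repo/a is already stamped, /repo/b is fresh.
const calls: string[] = [];
const resumed = setupScriptsSprintStart(
  { setupRanAt: { "/repo/a": "2026-05-06T08:00:00Z" } },
  ["/repo/a", "/repo/b"],
  (repoPath) => {
    calls.push(repoPath);
    return { exitCode: 0 };
  },
  () => "2026-05-06T09:00:00Z",
);
```

There is deliberately no fallback branch: both failure paths throw, so nothing downstream runs against a half-prepared repo.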

Test plan

- `pnpm typecheck` clean
- `pnpm lint` clean
- `pnpm test`: 233 files, 2262 tests pass (4 new: legacy-checkRanAt round-trip, no-audit-key fallback, hard-aborts on spawn error, skip-if-stamped + mixed-stamps)
- `grep -rn "function noopLeaf" src/` returns exactly one match (the new shared module)
- `grep -rn "setup-scripts-sprint-start-noop" src/` returns only the negative-assertion lines in the new test
- Manual smoke: `RALPHCTL_ROOT=/tmp/legacy ralphctl sprint show` against a hand-crafted v0.6.2 `sprint.json` with `checkRanAt: {}` should load without error
- Manual smoke: kill `sprint start` mid-task, re-run; the setup script does NOT re-execute on the stamped repo
- Manual smoke: point `setupScript` at a non-existent binary; the chain hard-aborts naming the failing repo

- drop per-unit tasks.json (canonical sprint-root copy is enough)
- drop prior-evaluations/; sibling evaluator output now renders inline inside tasks.md under a `### Evaluator output` block per task
- update evaluator workspace breadcrumbs, port docs, and ARCHITECTURE.md storage tree to match

- CLAUDE.md § Evaluator Pattern: full critique now points at execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md; Task.evaluationFile + Task.evaluationOutput noted.
- ARCHITECTURE.md § session.md audit: standalone sprint evaluate no longer writes to evaluations/session-<task-id>.md; it writes into execution/<unit-slug>/rounds/standalone-<ISO>/evaluator/.
- ARCHITECTURE.md § Harness Signals EvaluationSignal row: drop sidecar claim; note tasks.json preview + EvaluateAndFixLoopUseCase per-round write path.
- REQUIREMENTS.md § Harness-owned output writes + § Evaluator Pattern: replace evaluations/<taskId>.md with per-round path in both.
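
The per-round layout named in these doc updates can be illustrated with a small path builder. Only the directory layout comes from the commit message; the helper names are hypothetical, and because the exact timestamp format behind `<ISO>` is not stated, `toISOString()` is used purely as a placeholder.

```typescript
// Hypothetical helpers; only the path layout is taken from the commit message.
type Round = number | { standaloneAt: Date };

function roundEvaluatorDir(unitSlug: string, round: Round): string {
  const roundDir =
    typeof round === "number"
      ? String(round)
      : // Placeholder rendering of <ISO>; the real format is not specified.
        `standalone-${round.standaloneAt.toISOString()}`;
  return ["execution", unitSlug, "rounds", roundDir, "evaluator"].join("/");
}

// Full critique file for a numbered evaluate-and-fix round.
function roundEvaluationFile(unitSlug: string, round: number): string {
  return `${roundEvaluatorDir(unitSlug, round)}/evaluation.md`;
}

const numbered = roundEvaluationFile("fix-login", 2);
const standalone = roundEvaluatorDir("fix-login", {
  standaloneAt: new Date(Date.UTC(2026, 4, 6, 8, 0, 0)),
});
```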

Bundle of audit fixes against 0b6eece..HEAD so v0.6.2 users can upgrade without losing sprints and the chain trace + adapter contracts match the tests that fence them.

- Legacy `sprint.json` files load. The Zod schema now defaults `setupRanAt` to `{}`, so files written by v0.6.2 (carrying `checkRanAt`, no `setupRanAt`) parse successfully; the obsolete key is silently stripped on the next save and the file self-cleans without a migration. Two new sprint-schema tests cover the legacy round-trip and the no-audit-key-at-all variant.
- Spawn-level setup failures hard-abort. The `setup-scripts-sprint-start` leaf now wraps the `runSetupScript` call in try/catch; ENOENT, EPERM, and friends surface as `InvalidStateError({ currentState: 'setup-failed' })` naming the failing repo, identical to a non-zero exit. The soft `OnError` fallback (`setup-scripts-sprint-start-noop`) is gone: a broken baseline must fail loudly, not paper over a half-installed dependency tree. The `soft-degrades` test flips to `hard-aborts`; the trace fence asserts no `-noop` / `-degraded` step appears anywhere.
- Sprint resume skips already-stamped repos. The leaf no-ops per repo when `Sprint.setupRanAt[repoPath]` already carries a timestamp; the existing stamp is preserved (no churn). New tests cover the pure-resume case and the mixed (one stamped, one fresh) case.
- Identity-leaf helper consolidated. Single shared `noopLeaf<TCtx>(name)` at `src/application/chains/leaves/noop-leaf.ts`; the duplicate `noopLeafExec` / `noopLeaf` definitions in `execute-flow.ts` and `per-task-flow.ts` are removed.
- Tester agent memory line 232 now references `setupRanAt` (the post-rename field).
- Docs synced. CHANGELOG Unreleased rewrites the spawn-error bullet for the setup gate and adds resume-skip + legacy-load entries; CLAUDE.md and ARCHITECTURE.md tighten the "non-zero exit hard-aborts" wording to "any setup failure (non-zero exit or spawn-level error)".

Verification: `pnpm typecheck && pnpm lint && pnpm test` (233 files, 2262 tests pass).
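
For the identity-leaf consolidation, a shared helper of roughly this shape would serve both call sites. The `Leaf` interface is an assumption made for illustration (the chain framework's real leaf type is not shown in this PR text); only the `noopLeaf<TCtx>(name)` signature comes from the description.

```typescript
// Hypothetical leaf contract; the project's real type may differ (for example, async).
interface Leaf<TCtx> {
  readonly name: string;
  execute(ctx: TCtx): TCtx;
}

// Identity leaf: appears in the chain trace under `name` but returns the context unchanged.
function noopLeaf<TCtx>(name: string): Leaf<TCtx> {
  return { name, execute: (ctx) => ctx };
}

interface ExecuteCtx {
  sprintId: string;
}
const placeholder = noopLeaf<ExecuteCtx>("example-noop");
const ctxIn: ExecuteCtx = { sprintId: "s1" };
const ctxOut = placeholder.execute(ctxIn);
```

Keeping the helper generic over `TCtx` is what lets one definition replace the two flow-specific copies.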

lukas-grigis added 3 commits May 6, 2026 08:49

lukas-grigis merged commit adee553 into main May 6, 2026
1 check passed

lukas-grigis deleted the refactor/execution-unit-cleanup branch May 6, 2026 09:52

lukas-grigis merged 3 commits into main from refactor/execution-unit-cleanup
