refactor(execute): execution-unit cleanup + release-readiness fixes #106
Merged
- drop per-unit tasks.json (canonical sprint-root copy is enough)
- drop prior-evaluations/: sibling evaluator output now renders inline inside tasks.md under a `### Evaluator output` block per task
- update evaluator workspace breadcrumbs, port docs, and ARCHITECTURE.md storage tree to match
- CLAUDE.md § Evaluator Pattern: full critique now points at execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md; Task.evaluationFile + Task.evaluationOutput noted.
- ARCHITECTURE.md § session.md audit: standalone sprint evaluate no longer writes to evaluations/session-<task-id>.md; it writes into execution/<unit-slug>/rounds/standalone-<ISO>/evaluator/.
- ARCHITECTURE.md § Harness Signals EvaluationSignal row: drop sidecar claim; note tasks.json preview + EvaluateAndFixLoopUseCase per-round write path.
- REQUIREMENTS.md § Harness-owned output writes + § Evaluator Pattern: replace evaluations/<taskId>.md with per-round path in both.
Bundle of audit fixes against 0b6eece..HEAD so v0.6.2 users can upgrade without losing sprints and the chain trace + adapter contracts match the tests that fence them.

- Legacy `sprint.json` files load. The Zod schema now defaults `setupRanAt` to `{}`, so files written by v0.6.2 (carrying `checkRanAt`, no `setupRanAt`) parse successfully; the obsolete key is silently stripped on the next save and the file self-cleans without a migration. Two new sprint-schema tests cover the legacy round-trip and the no-audit-key-at-all variant.
- Spawn-level setup failures hard-abort. The `setup-scripts-sprint-start` leaf now wraps the `runSetupScript` call in try/catch; ENOENT, EPERM, and friends surface as `InvalidStateError({ currentState: 'setup-failed' })` naming the failing repo, identical to a non-zero exit. The soft `OnError` fallback (`setup-scripts-sprint-start-noop`) is gone: a broken baseline must fail loudly, not paper over a half-installed dependency tree. The `soft-degrades` test flips to `hard-aborts`; the trace fence asserts no `-noop` / `-degraded` step appears anywhere.
- Sprint resume skips already-stamped repos. The leaf no-ops per repo when `Sprint.setupRanAt[repoPath]` already carries a timestamp; the existing stamp is preserved (no churn). New tests cover the pure-resume case and the mixed (one stamped, one fresh) case.
- Identity-leaf helper consolidated. Single shared `noopLeaf<TCtx>(name)` at `src/application/chains/leaves/noop-leaf.ts`; the duplicate `noopLeafExec` / `noopLeaf` definitions in `execute-flow.ts` and `per-task-flow.ts` are removed.
- Tester agent memory line 232 now references `setupRanAt` (the post-rename field).
- Docs synced. CHANGELOG Unreleased rewrites the spawn-error bullet for the setup gate and adds resume-skip + legacy-load entries; CLAUDE.md and ARCHITECTURE.md tighten the "non-zero exit hard-aborts" wording to "any setup failure (non-zero exit or spawn-level error)".
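For readers who have not opened the diff, the hard-abort and resume-skip rules described above can be sketched roughly as follows. This is a synchronous, dependency-free illustration and not the code from this PR: the `Sprint` shape, the `RunSetupScript` signature, the `message` field on `InvalidStateError`, and the function names `runGuarded` and `setupScriptsSprintStart` are assumptions made for the example.

```typescript
// Hypothetical shapes; the real ralphctl types are not shown in this PR text.
interface Sprint {
  setupRanAt: Record<string, string>; // repoPath -> ISO timestamp
}

class InvalidStateError extends Error {
  readonly currentState: string;
  constructor(args: { currentState: string; message: string }) {
    super(args.message);
    this.name = "InvalidStateError";
    this.currentState = args.currentState;
  }
}

// Stand-in runner (assumed): returns an exit code, or throws on spawn-level
// errors such as ENOENT or EPERM. The real runner is presumably asynchronous.
type RunSetupScript = (repoPath: string) => number;

// Converts a thrown spawn-level error into the same error type that a
// non-zero exit produces, naming the failing repo.
function runGuarded(run: RunSetupScript, repoPath: string): number {
  try {
    return run(repoPath);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new InvalidStateError({
      currentState: "setup-failed",
      message: `setup script could not start in ${repoPath}: ${reason}`,
    });
  }
}

function setupScriptsSprintStart(
  sprint: Sprint,
  repoPaths: string[],
  run: RunSetupScript,
  now: () => string = () => new Date().toISOString(),
): Sprint {
  const setupRanAt = { ...sprint.setupRanAt };
  for (const repoPath of repoPaths) {
    // Resume case: an already-stamped repo is skipped and its stamp kept.
    if (setupRanAt[repoPath] !== undefined) continue;

    const exitCode = runGuarded(run, repoPath);
    if (exitCode !== 0) {
      throw new InvalidStateError({
        currentState: "setup-failed",
        message: `setup script exited with code ${exitCode} in ${repoPath}`,
      });
    }
    setupRanAt[repoPath] = now();
  }
  return { ...sprint, setupRanAt };
}
```

In this sketch there is deliberately no fallback branch: both failure modes throw, so a caller cannot continue a sprint on top of a half-installed dependency tree.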
Verification: pnpm typecheck && pnpm lint && pnpm test (233 files, 2262 tests pass).
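As a point of reference for the identity-leaf consolidation mentioned in the commit message, a shared `noopLeaf<TCtx>(name)` helper could look something like the sketch below. The `Leaf` interface is a hypothetical stand-in (the actual chain types are not shown here), and `execute` is kept synchronous for brevity.

```typescript
// Hypothetical leaf shape, assumed for illustration only.
interface Leaf<TCtx> {
  readonly name: string;
  execute(ctx: TCtx): TCtx;
}

// Identity leaf: it carries a name (so it can show up in a chain trace)
// and hands the context back unchanged.
function noopLeaf<TCtx>(name: string): Leaf<TCtx> {
  return {
    name,
    execute: (ctx: TCtx): TCtx => ctx,
  };
}
```

Because the helper is generic over `TCtx`, one definition can serve flows with different context types, which is presumably what makes the per-file duplicates unnecessary.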
Summary
Three commits on the back of the v0.6.2 setup/check-split branch:
- `refactor(execute): slim per-task execution unit folder`: drops per-unit `tasks.json` and `prior-evaluations/`; the canonical sprint-root copy + the inline `### Evaluator output` block in `tasks.md` cover both.
- `docs: replace stale evaluations/<taskId>.md path with per-round truth`: CLAUDE.md / ARCHITECTURE.md / REQUIREMENTS.md now point at `execution/<unit-slug>/rounds/<N>/evaluator/evaluation.md` (the actual file `EvaluateAndFixLoopUseCase` writes) instead of the legacy sidecar path.
- `fix(execute): release-readiness fixes for setup/check split`: six audit findings against `0b6eece4..HEAD` so v0.6.2 users can upgrade without losing sprints and the chain trace + adapter contracts match the tests that fence them.

Release-readiness fixes (last commit)
- Legacy `sprint.json` files load. Zod schema defaults `setupRanAt` to `{}`; v0.6.2 files carrying `checkRanAt` (the legacy key) parse successfully and self-clean on the next save. Two new sprint-schema tests cover the round-trip and the no-audit-key-at-all variant.
- Spawn-level setup failures hard-abort. `setup-scripts-sprint-start` now wraps the `runSetupScript` call in `try/catch`; ENOENT, EPERM, missing binary all surface as `InvalidStateError({ currentState: 'setup-failed' })` naming the failing repo, identical to a non-zero exit. The soft `OnError` fallback (`setup-scripts-sprint-start-noop`) is gone: a broken baseline must fail loudly.
- Sprint resume skips already-stamped repos. The leaf no-ops per repo when `Sprint.setupRanAt[repoPath]` already carries a timestamp; the existing stamp is preserved.
- Identity-leaf helper consolidated. Single shared `noopLeaf<TCtx>(name)` at `src/application/chains/leaves/noop-leaf.ts`; the duplicate `noopLeafExec` / `noopLeaf` definitions in `execute-flow.ts` and `per-task-flow.ts` are removed.
- Tester agent memory line 232 now references `setupRanAt` (post-rename field).

Pipeline (post-merge)
Test plan
- `pnpm typecheck` clean
- `pnpm lint` clean
- `pnpm test`: 233 files, 2262 tests pass (4 new: legacy-checkRanAt round-trip, no-audit-key fallback, hard-aborts on spawn error, skip-if-stamped + mixed-stamps)
- `grep -rn "function noopLeaf" src/` returns exactly one match (the new shared module)
- `grep -rn "setup-scripts-sprint-start-noop" src/` returns only the negative-assertion lines in the new test
- `RALPHCTL_ROOT=/tmp/legacy ralphctl sprint show` against a hand-crafted v0.6.2 `sprint.json` with `checkRanAt: {}`: should load without error
- Interrupt `sprint start` mid-task, re-run: setup script does NOT re-execute on the stamped repo
- Point `setupScript` at a non-existent binary: chain hard-aborts naming the failing repo
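To make the legacy-load check in the test plan concrete, here is a dependency-free sketch of the parsing behavior the summary describes: `setupRanAt` defaults to `{}` and the obsolete `checkRanAt` key is dropped. It is a hand-written stand-in rather than the Zod schema from this PR (Zod object schemas strip unknown keys by default, which is consistent with the described self-clean). The `SprintFile` shape, its `name` field, and the `parseSprintFile` function are assumptions for the example.

```typescript
// Hypothetical subset of the sprint file; only the audit keys matter here.
interface SprintFile {
  name: string;
  setupRanAt: Record<string, string>; // repoPath -> ISO timestamp
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Accepts both current files and v0.6.2-style files that carry `checkRanAt`
// but no `setupRanAt`.
function parseSprintFile(raw: unknown): SprintFile {
  if (!isRecord(raw)) throw new Error("sprint file must be an object");
  const name = raw["name"];
  if (typeof name !== "string") throw new Error("sprint file needs a name");

  // Default: a missing `setupRanAt` becomes an empty map.
  const setupRanAt: Record<string, string> = {};
  const stamps = raw["setupRanAt"];
  if (stamps !== undefined) {
    if (!isRecord(stamps)) throw new Error("setupRanAt must be an object");
    for (const repoPath of Object.keys(stamps)) {
      const stamp = stamps[repoPath];
      if (typeof stamp !== "string") throw new Error("stamps must be strings");
      setupRanAt[repoPath] = stamp;
    }
  }

  // Only known keys are copied, so a legacy `checkRanAt` never reaches the
  // next save and the file cleans itself up without a migration step.
  return { name, setupRanAt };
}
```

With this shape, `parseSprintFile({ name: "demo", checkRanAt: {} })` yields an object with an empty `setupRanAt` and no `checkRanAt`, which is the outcome the manual `sprint show` check is looking for.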