feat(repair): wire the bounded self-repair loop (opt-in)#80
Merged
Conversation
SPEC §5's 5-iteration self-repair loop was documented in CLAUDE.md and
rendered by the report (repairSection) but never implemented — failures
just surfaced and the agent exited. This wires it end-to-end.
- src/repair-loop.ts: runRepairLoop — pure, dependency-injected control
flow. Targets the highest-priority code-repairable failure (all Layer 1
leftover-token misses before any Layer 2 build miss), repairs, re-validates
that platform, records a RepairAttempt, repeats until green or the cap.
Hard-capped at REPAIR_ITERATION_CAP=5 (CLAUDE.md). Layer 3 (vision) and
the contract reviewer are surfaced, not auto-repaired — a Layer 3 miss is
often environmental, not a source bug.
- src/agents/repair.ts: runRepair — a Claude Agent SDK query() pass scoped
(cwd) to the failing generated project, Read/Edit/Grep(/Bash for Layer 2),
bypassPermissions, hermetic settingSources:[], bounded maxTurns. Stub path
via isStub("repair"); "repair" added to AgentName.
- dispatch: after a failing first judge, if NATIVEAPPTEMPLATE_REPAIR is set
(on / positive int, clamped to 5), run the loop with real repair +
per-platform Layer1/Layer2 revalidation, then fold the result back into the
judge + thread RepairAttempt[] into the report. Off by default; skipped in
stub mode.
- report: BuildRunReportInput.repairAttempts → RunReport.repairAttempts, so
the existing repairSection populates.
- tests: 7 new — loop resolves after one pass; gives up at the cap; clamps
maxIterations to 5; fixes Layer 1 before Layer 2; no-ops on an
unrepairable Layer 3 miss; report carries/omits repairAttempts.
- docs: SPEC §5 row → Shipped (opt-in); README flag; CLAUDE.md pointer.
The loop logic is unit-tested with injected fakes (no LLM/device). The
real repair-agent path is opt-in and not exercised in CI; an end-to-end
real-failure demo (hackathon stretch) is a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Implements SPEC §5's self-repair loop, which was documented in CLAUDE.md and already rendered by the validation report (
repairSection) but never wired — failures just surfaced and the agent exited.How it works
When the first judge pass fails on a code-repairable layer, the agent iterates:
runRepair) — a Claude Agent SDKquery()scoped (cwd) to the failing generated project, with Read/Edit/Grep (+ Bash for Layer 2),bypassPermissions, hermeticsettingSources: [], boundedmaxTurns.RepairAttempt.REPAIR_ITERATION_CAP, per CLAUDE.md).Layer 3 (vision) and the contract reviewer are surfaced, not auto-repaired — a Layer 3 miss is often environmental (cf. #78's notification modal), not a source bug.
Design notes
src/repair-loop.tsis pure + dependency-injected (repair/revalidatedeps), so the loop logic is fully unit-testable without the LLM or a device.NATIVEAPPTEMPLATE_REPAIR(onor a positive integer, clamped to 5). Off by default — no change to existing behavior. Skipped in stub mode.RepairAttempt[]is threadeddispatch → buildRunReport → RunReport.repairAttempts, populating the existing report section.Tests (7 new, 56 total green)
Loop resolves after one pass · gives up at the cap · clamps
maxIterationsto 5 · fixes Layer 1 before Layer 2 · no-ops on an unrepairable Layer 3 miss · report carries/omitsrepairAttempts.Honest scope
The loop control flow is unit-tested with injected fakes. The real repair-agent path (
runRepairdriving the Agent SDK) is opt-in and not exercised in CI. An end-to-end demonstration of the loop fixing a real induced failure (the hackathon stretch goal) is a sensible follow-up — it needs an induced failure + a full run with a live key and devices.npm run cigreen locally (build + typecheck:tests + 56 tests).🤖 Generated with Claude Code