Skip to content

feat(repair): wire the bounded self-repair loop (opt-in)#80

Merged
dadachi merged 1 commit into
mainfrom
feat/self-repair-loop
May 22, 2026
Merged

feat(repair): wire the bounded self-repair loop (opt-in)#80
dadachi merged 1 commit into
mainfrom
feat/self-repair-loop

Conversation

@dadachi
Copy link
Copy Markdown
Contributor

@dadachi dadachi commented May 22, 2026

What

Implements SPEC §5's self-repair loop, which was documented in CLAUDE.md and already rendered by the validation report (repairSection) but never wired — failures just surfaced and the agent exited.

How it works

When the first judge pass fails on a code-repairable layer, the agent iterates:

  1. Pick the highest-priority failure — all Layer 1 leftover-token misses before any Layer 2 build miss (leftover tokens routinely cause the build error).
  2. Run a repair pass (runRepair) — a Claude Agent SDK query() scoped (cwd) to the failing generated project, with Read/Edit/Grep (+ Bash for Layer 2), bypassPermissions, hermetic settingSources: [], bounded maxTurns.
  3. Re-validate that platform (Layer 1 + Layer 2 only).
  4. Record a RepairAttempt.
  5. Repeat until green or the hard cap of 5 (REPAIR_ITERATION_CAP, per CLAUDE.md).

Layer 3 (vision) and the contract reviewer are surfaced, not auto-repaired — a Layer 3 miss is often environmental (cf. #78's notification modal), not a source bug.

Design notes

  • src/repair-loop.ts is pure + dependency-injected (repair / revalidate deps), so the loop logic is fully unit-testable without the LLM or a device.
  • Opt-in via NATIVEAPPTEMPLATE_REPAIR (on or a positive integer, clamped to 5). Off by default — no change to existing behavior. Skipped in stub mode.
  • RepairAttempt[] is threaded dispatch → buildRunReport → RunReport.repairAttempts, populating the existing report section.

Tests (7 new, 56 total green)

Loop resolves after one pass · gives up at the cap · clamps maxIterations to 5 · fixes Layer 1 before Layer 2 · no-ops on an unrepairable Layer 3 miss · report carries/omits repairAttempts.

Honest scope

The loop control flow is unit-tested with injected fakes. The real repair-agent path (runRepair driving the Agent SDK) is opt-in and not exercised in CI. An end-to-end demonstration of the loop fixing a real induced failure (the hackathon stretch goal) is a sensible follow-up — it needs an induced failure + a full run with a live key and devices.

npm run ci green locally (build + typecheck:tests + 56 tests).

🤖 Generated with Claude Code

SPEC §5's 5-iteration self-repair loop was documented in CLAUDE.md and
rendered by the report (repairSection) but never implemented — failures
just surfaced and the agent exited. This wires it end-to-end.

- src/repair-loop.ts: runRepairLoop — pure, dependency-injected control
  flow. Targets the highest-priority code-repairable failure (all Layer 1
  leftover-token misses before any Layer 2 build miss), repairs, re-validates
  that platform, records a RepairAttempt, repeats until green or the cap.
  Hard-capped at REPAIR_ITERATION_CAP=5 (CLAUDE.md). Layer 3 (vision) and
  the contract reviewer are surfaced, not auto-repaired — a Layer 3 miss is
  often environmental, not a source bug.
- src/agents/repair.ts: runRepair — a Claude Agent SDK query() pass scoped
  (cwd) to the failing generated project, Read/Edit/Grep(/Bash for Layer 2),
  bypassPermissions, hermetic settingSources:[], bounded maxTurns. Stub path
  via isStub("repair"); "repair" added to AgentName.
- dispatch: after a failing first judge, if NATIVEAPPTEMPLATE_REPAIR is set
  (on / positive int, clamped to 5), run the loop with real repair +
  per-platform Layer1/Layer2 revalidation, then fold the result back into the
  judge + thread RepairAttempt[] into the report. Off by default; skipped in
  stub mode.
- report: BuildRunReportInput.repairAttempts → RunReport.repairAttempts, so
  the existing repairSection populates.
- tests: 7 new — loop resolves after one pass; gives up at the cap; clamps
  maxIterations to 5; fixes Layer 1 before Layer 2; no-ops on an
  unrepairable Layer 3 miss; report carries/omits repairAttempts.
- docs: SPEC §5 row → Shipped (opt-in); README flag; CLAUDE.md pointer.

The loop logic is unit-tested with injected fakes (no LLM/device). The
real repair-agent path is opt-in and not exercised in CI; an end-to-end
real-failure demo (hackathon stretch) is a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dadachi dadachi merged commit a1637f0 into main May 22, 2026
1 check passed
@dadachi dadachi deleted the feat/self-repair-loop branch May 22, 2026 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant