A /pre-loop skill for Claude Code + Codex that turns "keep going until it's done" from a token-burning gamble into an engineered, self-verifying loop, plus a runnable demo you can put on screen in 60 seconds.
This is how most people run an AI coding agent: prompt it, read the output, spot what's wrong, correct it, run it again. You're the trigger, the memory, and the verifier, spending your attention on a job the machine can do itself.
flowchart LR
subgraph BEFORE["You are the loop"]
direction TB
Y["You"] -->|prompt| A["Agent acts"]
A -->|"output to eyeball"| Y
end
subgraph AFTER["✅ The loop runs itself"]
direction TB
G["Goal + contract"] --> A2["Agent acts"]
A2 --> V{"verifier:<br/>passes?"}
V -->|no, retry| A2
V -->|yes| S["✅ stop"]
end
BEFORE -.->|"loop engineering"| AFTER
Loop engineering is handing that loop to the machine: give it a goal, a way to check its own work, and a place to stop. Done right, you go from babysitting one agent to designing loops that run themselves. But there's a catch nobody warns you about.
A loop is only as good as the contract you give it. Point an agent at "keep going until the tests pass" with no real contract, and you hit the four classic failure modes:
| Failure mode | What it looks like |
|---|---|
| No real verifier | "Done" can't be checked, so the agent stops on vibes: confident, wrong output. |
| Gamed verifier | It deletes the failing test or hard-codes the answer. The check goes green; the work isn't done. |
| Drift | No scope, no architecture, so it "helpfully" rewrites half your app and breaks three other things. |
| No brakes | No budget, no stop condition, no isolation: it runs forever, or wrecks your working tree. |
A gamed verifier is worse than no verifier: it hands you false confidence at scale.
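One way to guard against the "deletes the failing test" form of gaming is to fingerprint the protected files before the loop starts and refuse a green result if any of them changed. The sketch below is a minimal illustration, not the skill's actual guard: the function names `snapshot` and `tampered` are hypothetical, and it only catches edits or deletions of protected files, not hard-coded answers in the code under test.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List


def _digest(path: Path) -> str:
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(paths: Iterable[Path]) -> Dict[str, str]:
    """Record a digest for every protected file before the loop starts."""
    return {str(p): _digest(p) for p in paths}


def tampered(baseline: Dict[str, str]) -> List[str]:
    """Return protected files that were edited or deleted since the snapshot."""
    changed = []
    for name, digest in baseline.items():
        path = Path(name)
        if not path.exists() or _digest(path) != digest:
            changed.append(name)
    return changed


# Demo with a throwaway file standing in for a protected test file.
with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "test_pricing.py"
    test_file.write_text("def test_total():\n    assert True\n")
    baseline = snapshot([test_file])
    print("before:", tampered(baseline))   # nothing has been touched yet
    test_file.write_text("")               # the "agent" guts the test
    print("after:", tampered(baseline))    # the gutted file is reported
```

A loop would treat a non-empty `tampered(...)` result as a failed run, regardless of what the test command reports.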
The fix isn't a better prompt. It's a better contract. That's what this repo builds for you.
Two things you can run today:
| # | What | What it does |
|---|---|---|
| 1 | The /pre-loop skill | Designs the loop before you run it: reads your repo, locks in the architecture, builds and guards a real verifier, sets the guardrails, and hands you a ready-to-run /goal. Works in Claude Code and Codex. |
| 2 | A 60-second demo | A tiny project with 3 failing tests, so you can watch a loop fix real bugs and stop on its own, with a one-command reset between takes. |
Instead of writing a giant prompt from memory, you type /pre-loop and it runs as an interactive wizard:
flowchart TD
Start(["You: 'loop this'"]) --> Read["Read the repo<br/>stack · conventions · build & test commands"]
Read --> Arch["Lock in the architecture<br/>where it fits · patterns · interfaces · constraints"]
Arch --> Ver["Build + guard the verifier"]
Ver --> V1["scaffold one<br/>if it's missing"]
Ver --> V2["anti-gaming<br/>guards"]
Ver --> V3["pre-flight:<br/>run it once"]
V1 --> Guard
V2 --> Guard
V3 --> Guard["Guardrails<br/>branch · budget · stop · rollback"]
Guard --> Fit{"Good fit<br/>to loop?"}
Fit -->|no| No["Here's exactly why:<br/>fix this first"]
Fit -->|yes| Brief["loop-brief.md<br/>+ a ready-to-run /goal"]
Brief --> Run(["Run the loop"])
What that gives the loop that a raw prompt never does:
- Full context, inferred: it reads CLAUDE.md/AGENTS.md, the README, manifests, lint/test/CI config, and nearby code before asking you anything. (Claude Code asks about the gaps through its question UI; Codex asks in the terminal.)
- Architecture that fits: where the work lives, the patterns to mirror, the interfaces it can't break, the things to avoid. No durable context file? It offers to write one.
- A verifier you can trust: the full verifier stack (tests · types · lint · build · review), scaffolded if it doesn't exist, with anti-gaming guards, and pre-flighted (run once to prove it actually works).
- Brakes and a seatbelt: an isolated branch/worktree, a rollback, a budget, stop/abort conditions, and a stuck-detector.
- The honesty to say "don't": if it isn't a good fit, it tells you why and what to fix first.
The output is an operable loop: verified, self-correcting progress with a guaranteed stop.
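As an illustration of what a stuck-detector might check, the sketch below looks at the number of failing checks after each attempt and reports "stuck" when the last few attempts brought no improvement. The name `is_stuck`, the `window` parameter, and the choice of "failing check count" as the progress signal are all assumptions made for this example; the skill may measure progress differently.

```python
from typing import List


def is_stuck(failure_counts: List[int], window: int = 3) -> bool:
    """True when the last `window` attempts never beat the count that preceded them.

    failure_counts[i] is the number of failing checks after attempt i.
    """
    if len(failure_counts) < window + 1:
        return False                       # not enough history to judge
    recent = failure_counts[-(window + 1):]
    baseline, attempts = recent[0], recent[1:]
    return min(attempts) >= baseline       # no attempt reduced the failures


print(is_stuck([5, 4, 3, 2]))      # steady progress
print(is_stuck([5, 3, 3, 3, 3]))   # three attempts with no improvement
```

Paired with a hard attempt budget, a check like this gives the loop two independent reasons to stop before it burns more tokens.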
stateDiagram-v2
[*] --> Context: trigger (the goal)
Context --> Action: read the brief + state
Action --> Verify: run the verifier stack
Verify --> Done: ✅ passes
Verify --> Action: not yet, retry
Verify --> Abort: budget / no progress
Done --> [*]: ship + evidence
Abort --> [*]: surface the blocker
The verifier is the gate: it's what lets the loop run without you, and what gets you out of the chair.
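The state diagram above can be sketched as a small driver function. This is a toy model under stated assumptions: `run_loop`, `LoopResult`, and the callables `act` and `verify` are hypothetical names, the "agent" is a stand-in that removes one bug per attempt, and the only abort condition modeled is the attempt budget.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    status: str      # "done" or "abort"
    attempts: int
    reason: str


def run_loop(act: Callable[[int], None],
             verify: Callable[[], bool],
             max_attempts: int = 5) -> LoopResult:
    """Act, verify, and stop on proof or when the budget runs out."""
    for attempt in range(1, max_attempts + 1):
        act(attempt)          # Action: the agent makes a change
        if verify():          # Verify: the gate decides, not the agent
            return LoopResult("done", attempt, "verifier passed")
    return LoopResult("abort", max_attempts, "budget exhausted")


# Toy stand-ins: three open bugs, and an "agent" that fixes one per attempt.
open_bugs: List[str] = ["bug-a", "bug-b", "bug-c"]


def fix_one(attempt: int) -> None:
    if open_bugs:
        open_bugs.pop()


result = run_loop(act=fix_one, verify=lambda: not open_bugs)
print(result.status, result.attempts, result.reason)
```

The important design choice is that `verify` is the only thing that can return "done"; the acting side has no way to declare success on its own.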
This isn't a thought experiment. We pointed /pre-loop at the real Getting Automated content library (69 published guides, videos, and workflows) and let the loop it designed run on a branch. Genuine before/after:
| | Errors | Verdict |
|---|---|---|
| Before | 20 | FAIL |
| After | 0 | PASS |
What the loop actually did:
- Fixed 14 structural issues (relatedContent stored as a bare string, 8 videos missing an id, a workflow missing its --- frontmatter fence), touching frontmatter only, with every body byte-identical to the original.
- Caught a bug in its own verifier during the pre-flight (6 false positives) and fixed the validator, not the content.
- Refused to invent the 67 facts it was missing (publish dates, tags, categories, thumbnails) and flagged every one for human review instead of hallucinating to go green.
That last point is the whole game: a lazy "make it pass" loop fakes the data; this one stopped cold at the line between structure (safe to fix) and facts (a human's call).
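The structure-versus-facts split can be illustrated with a small triage function. This is a sketch, not the validator from the run: `triage` is a hypothetical name, and the field names in `FACT_FIELDS` are assumptions chosen to mirror the kinds of facts listed above. Only `relatedContent` comes from the text.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical field names for facts that only a human can supply.
FACT_FIELDS = ("publishDate", "tags", "category", "thumbnail")


def triage(frontmatter: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (structurally fixed copy, fact fields that need human review)."""
    fixed = dict(frontmatter)
    # Structural: a bare string where a list belongs.
    # The value is preserved; only its shape changes.
    if isinstance(fixed.get("relatedContent"), str):
        fixed["relatedContent"] = [fixed["relatedContent"]]
    # Factual: missing values are flagged, never invented.
    needs_review = [name for name in FACT_FIELDS if name not in fixed]
    return fixed, needs_review


fixed, needs_review = triage(
    {"id": "video-1", "relatedContent": "guide-a", "tags": ["automation"]}
)
print(fixed["relatedContent"])
print(needs_review)
```

The function never writes a value into a fact field, so a loop built on it cannot go green by making data up.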
The full run (the wizard session, the contract, the real diff, and the actual validator it built): example/content-library-loop/
The easiest install is no install at all. Clone the repo and open your agent inside it. The skill auto-loads as a project skill (it lives in .claude/skills/ and .agents/skills/, which Claude Code and Codex both scan automatically):
git clone https://github.com/Getting-Automated/loop-engineering-skill-and-example
cd loop-engineering-skill-and-example
claude # or: codex
Then run /pre-loop (Claude Code) or $pre-loop (Codex). No copying, no config.
Want it available in every project?
- Claude Code: run cp -r .claude/skills/pre-loop ~/.claude/skills/ and then /pre-loop
- Codex: from a Codex session, run $skill-installer install https://github.com/Getting-Automated/loop-engineering-skill-and-example/tree/main/.agents/skills/pre-loop (or cp -r .agents/skills/pre-loop ~/.agents/skills/) and then $pre-loop
Run the 60-second demo in the repo you just cloned:
cd example/pricing
pip install -r requirements.txt
pytest -q # 3 failed
Now hand it to a loop. In Claude Code (claude) or Codex (codex), the command is the same:
/goal "every test passes (pytest -q exits 0) by fixing the bugs in pricing.py, not by editing test_pricing.py"
Watch it read the failing tests, fix pricing.py, re-run pytest, and stop on green. That's the whole thing: action → verifier → stop.
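"Stop on green" comes down to trusting the exit code of the check rather than the agent's own summary. A minimal sketch of that idea follows; `verifier_passes` is a hypothetical helper, and the demo commands are trivial stand-ins so the example runs anywhere. In example/pricing the command would be the demo's own check, pytest -q.

```python
import subprocess
import sys
from typing import Sequence


def verifier_passes(cmd: Sequence[str], cwd: str = ".", timeout: int = 300) -> bool:
    """Run the check and trust only its exit code (0 means pass)."""
    completed = subprocess.run(
        cmd, cwd=cwd, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode == 0


# Stand-in commands: one that exits 0 and one that exits 1.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
print(verifier_passes(passing), verifier_passes(failing))

# In the demo directory, the real check would be something like:
#   verifier_passes([sys.executable, "-m", "pytest", "-q"], cwd="example/pricing")
```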
Reset between takes:
./reset.sh # restores the bugs, clears caches, confirms 3 failed
Self-contained: no git required, so it works live on stream every time.
"Couldn't a good model just one-shot this?" Honestly, probably. Three bugs in one file is small enough that a strong model often fixes it in a single pass. That's the tradeoff of a 60-second demo: a problem genuinely too big to one-shot is also too big to show in 60 seconds.
The point was never the number of tries; it's the pattern. The agent checked its own work against a verifier it couldn't fake and stopped on proof, not on "good enough." Even a one-shot run still stops on proof.
Now extrapolate. The same loop is what carries a multi-file migration, a flaky-test hunt across a whole suite, a forty-file refactor, or an overnight job you can't babysit: work that one-shotting won't reliably land, or that you wouldn't want to one-shot because you need the receipts. The demo is small so you can watch the whole loop run end to end. The value is everything it scales to.
Every loop runs against a loop-brief.md, the contract the agent executes against and the verifier checks against. That single artifact is the leverage point:
flowchart LR
Brief["loop-brief.md<br/>(the contract)"]
Brief -->|"what to build,<br/>how it must fit"| Loop["The loop<br/>executes it"]
Brief -->|"what 'done' and<br/>'good' mean"| Ver["The verifier<br/>checks against it"]
Loop --> Ver
Ver -->|proof| Ship["Done + evidence"]
It scales to the task. A one-file fix fills three sections; a new subsystem fills them all:
- loop-brief-template.md: the blank contract (intent · scope · architecture · plan · verifier stack + anti-goals · safety · health).
- loop-brief-example.md: the template filled in for a real, non-trivial task (adding rate limiting to a public API), so you can see the depth.
- example/content-library-loop/: a real /pre-loop run, start to finish, with the wizard session and the brief it wrote for a content-QA loop, grounded in an actual business-context repo.
/pre-loop will stop you, on purpose, when a loop is the wrong tool:
- There's no real verifier and one can't be built: you'd just get confident wrong answers.
- The agent would have to guess the facts (no source of truth).
- It's a deterministic rule: write code, not a loop.
- Irreversible side effects (money, prod, deletes, outbound messages) with no approval gate.
- Nobody will review the output: a loop that piles up unreviewed work just creates review debt.
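The five stop conditions above read naturally as a pre-flight checklist. The sketch below encodes them as one; the `Task` fields and the `blockers` function are hypothetical names invented for this example, and the real skill gathers these answers interactively rather than from a dataclass.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    has_verifier: bool                   # a real check exists or can be built
    has_source_of_truth: bool            # facts can be looked up, not guessed
    is_deterministic_rule: bool          # plain code would do the job
    irreversible_without_approval: bool  # side effects with no approval gate
    has_reviewer: bool                   # someone will review the output


def blockers(task: Task) -> List[str]:
    """Return every reason this task is a bad fit for a loop (empty means go)."""
    checks = [
        (not task.has_verifier, "no real verifier"),
        (not task.has_source_of_truth, "agent would have to guess the facts"),
        (task.is_deterministic_rule, "deterministic rule: write code instead"),
        (task.irreversible_without_approval, "irreversible side effects, no approval gate"),
        (not task.has_reviewer, "nobody will review the output"),
    ]
    return [reason for failed, reason in checks if failed]


good_fit = Task(True, True, False, False, True)
bad_fit = Task(False, True, False, True, True)
print(blockers(good_fit))
print(blockers(bad_fit))
```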
.
├── .claude/skills/pre-loop/SKILL.md # the skill, for Claude Code
├── .agents/skills/pre-loop/SKILL.md # the skill, for Codex
├── loop-brief-template.md # the blank contract
├── loop-brief-example.md # a filled, non-trivial contract
└── example/pricing/ # the runnable demo
    ├── pricing.py # 3 intentional bugs
    ├── test_pricing.py # the verifier (source of truth)
    ├── loop-brief.md # the contract for this demo
    └── reset.sh # restore the bugs between takes
Generation is easy. Verified, self-correcting progress is the whole point.
The leaders building these tools have already made the jump: "I don't prompt Claude anymore. I write loops, and the loops do the work. My job is to write loops." The skill in this repo is how you make that practical without the loop drifting, gaming its tests, or running off a cliff. A loop is only as good as the contract it executes against, so write the contract first.
Built alongside the Getting Automated Loop Engineering explainer on YouTube. If this saved you a few burned token-hours, star the repo.
- Website: gettingautomated.com
- YouTube: @hunterasneed
- Free automation tools: tools.gettingautomated.com
If you want loops wired into your real workflow (client delivery, content operations, internal automation), grab a slot and we'll scope it:
Schedule a 30-Minute Connect
MIT; see LICENSE. Use it, fork it, ship it.
