catchball

Don't bounce between Codex, Claude and Copilot.

Put tasks under ./tasks, run catchball, and it handles the implement → review → fix loop until everything is clean.

For example, say you created these tasks in the tasks folder:

010-setup.md
020-build.md
030-tests.md

Then ask catchball to coordinate the work:

uv run catchball --worker copilot --reviewer claude --fixer codex

Each task runs through this loop until it passes or fails.

Prerequisites

Python 3.11+ — https://www.python.org/downloads/
uv — https://docs.astral.sh/uv/getting-started/installation/
At least one agent CLI on your PATH: Claude Code, OpenAI Codex, GitHub Copilot, or OpenCode

Quick start

Run from a local checkout:

git clone https://github.com/hosamsh/catchball.git
cd catchball
uv run catchball --worker claude --reviewer codex

To install catchball directly from GitHub as a standalone command:

uv tool install git+https://github.com/hosamsh/catchball.git
catchball --worker claude --reviewer codex

If you already cloned the repo and want the same standalone command from this checkout:

uv tool install .
catchball --worker claude --reviewer codex

To run against a project in a different folder:

catchball --project-root ../my-app --worker claude --reviewer codex

Agents run inside the project root and pick up any existing harness files automatically (AGENTS.md, CLAUDE.md, etc.).

FAQ

What tools are supported?

Claude Code, OpenAI Codex, GitHub Copilot, and OpenCode. You can use one for everything or mix and match as you wish.

What goes in a task file?

Plain markdown. Describe what you want built, fixed, or changed. catchball passes the task file path to the worker as its prompt. One clear goal per file works best.

How does the review loop work?

Worker runs first, then the reviewer. If the reviewer finds nothing to flag, the task passes. If it writes issues to a .review file, the fixer (or worker) gets a fix round. The fixer can either fix the issues or write a .response file for any issues it is intentionally pushing back on. Before the next review pass, catchball archives the active .review and .response together for that round, then the reviewer reads both before deciding what still stands. This repeats up to --review-passes times (default 3).
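The loop described above can be sketched in a few lines of Python. This is an illustration only, not catchball's actual code: `run_task` and its callback parameters are hypothetical names, and the sketch assumes a task fails when the last allowed review pass still flags issues.

```python
from typing import Callable, Optional

# Hypothetical callback shapes, invented for this sketch:
#   reviewer(previous_review, previous_response) -> review text, or None when clean
#   fixer(review) -> response text for pushed-back issues, or None
Reviewer = Callable[[Optional[str], Optional[str]], Optional[str]]
Fixer = Callable[[str], Optional[str]]


def run_task(
    worker: Callable[[], None],
    reviewer: Reviewer,
    fixer: Fixer,
    review_passes: int = 3,
) -> bool:
    """Return True if the task passes, False if the review passes run out."""
    worker()  # implement stage
    review: Optional[str] = None
    response: Optional[str] = None
    for pass_number in range(1, review_passes + 1):
        # The reviewer sees the previous round's review and response, if any.
        review = reviewer(review, response)
        if review is None:
            return True  # nothing flagged: the task passes
        if pass_number == review_passes:
            break  # assumption: no fix round after the final review pass
        # Fix round: fix the issues, or respond to the ones being pushed back on.
        response = fixer(review)
    return False
```

When no dedicated fixer is configured, the worker would take the place of the `fixer` callback in this sketch.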

How do I pick a model?

Use --worker-model, --reviewer-model, or --fixer-model:

catchball --worker claude --worker-model opus --reviewer codex --reviewer-model gpt-5.4

OpenCode uses its native provider/model naming, for example:

catchball --worker opencode --worker-model anthropic/claude-sonnet-4-5 --reviewer codex

Can I set the effort level?

If the tool supports it:

catchball --worker claude --reviewer codex --worker-effort high --reviewer-effort medium

Can I provide custom instruction files?

Use --worker-instructions or --reviewer-instructions:

catchball --worker claude --reviewer codex --worker-instructions ./WORKER.md --reviewer-instructions ./REVIEWER.md

If WORKER.md or REVIEWER.md exists at the repo root, catchball picks it up automatically.

Can I add a dedicated fixer?

--fixer is optional. When set, fix rounds use that agent instead of the worker.

catchball --worker copilot --fixer codex --reviewer claude

OpenCode works there too:

catchball --worker opencode --fixer codex --reviewer claude

If FIXER.md exists at repo root, catchball picks it up. Otherwise it uses WORKER.md as guidance.

Fix rounds keep reviewer and fixer ownership separate: reviewers write .review files, and fixers write .response files only when they are intentionally not fixing a flagged issue in that round.

Can I run catchball from outside the target repo?

Yes, but set --project-root:

catchball --project-root ../my-app --worker claude --reviewer codex

Agent CLIs run in that folder and relative paths like ./tasks resolve from there.

Can I use a different task folder?

catchball --worker claude --reviewer codex --tasks ./packages/api/tasks

Can I start from a specific task?

Use --from:

catchball --worker claude --reviewer codex --from 020-build.md

Can I change the number of review rounds?

Use --review-passes:

catchball --worker claude --reviewer codex --review-passes 5

--retries is supported as a compatibility alias for extra fix/review loops, but it cannot be combined with --review-passes.
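One common way to enforce this kind of either/or rule in a Python CLI is an argparse mutually exclusive group. The sketch below shows that technique under assumptions: `build_parser` is a hypothetical name, whether catchball uses argparse at all is not stated, and how a --retries value translates into review passes is not modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser that rejects --review-passes combined with --retries."""
    parser = argparse.ArgumentParser(prog="catchball-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--review-passes", type=int, default=3)
    # Compatibility alias; the mapping to review passes is left out on purpose.
    group.add_argument("--retries", type=int, default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--review-passes", "5"])
    print(args.review_passes)
```

Passing both flags makes argparse print a usage error and exit instead of silently picking one.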

What happens when a task is still not clean after exhausting the review passes?

catchball marks it with a .failed sidecar and stops the whole operation. Use --continue-despite-failures to move on to the remaining tasks instead.

Can I rerun a task list?

Tasks with a .done marker under catchball-runs/state/ are skipped on reruns. Failed tasks get a .failed marker instead and are retried on the next run.

Each tasks directory gets its own namespaced folder under that shared state tree, so rerun state stays persistent without mixing different task lists together.

The namespace is intentionally flat and readable, based on the tail of the tasks path. For example, .samples/sample-tasks/js-click-game becomes sample-tasks--js-click-game under catchball-runs/state/.

For a completely fresh start, use --reset-state or delete the state folder:

catchball --worker claude --reviewer codex --reset-state
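The marker and namespace rules above can be pictured with a small sketch. Both helpers are hypothetical: the namespace rule (last two path components joined by "--") is inferred from the single example given, and the marker file name (`<task file name>.done`) is an assumption.

```python
from pathlib import Path, PurePosixPath


def state_namespace(tasks_path: str, tail_parts: int = 2) -> str:
    """Build a flat, readable namespace from the tail of the tasks path."""
    parts = [p for p in PurePosixPath(tasks_path).parts if p not in (".", "/")]
    return "--".join(parts[-tail_parts:])


def pending_tasks(task_names: list[str], namespace_dir: Path) -> list[str]:
    """Keep tasks without a .done marker; .failed tasks stay eligible for retry."""
    return [
        name
        for name in task_names
        if not (namespace_dir / f"{name}.done").exists()  # assumed marker name
    ]


if __name__ == "__main__":
    print(state_namespace(".samples/sample-tasks/js-click-game"))
```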

Can I pass extra arguments to a tool?

Use --worker-arg, --fixer-arg, or --reviewer-arg:

catchball --worker claude --reviewer codex --worker-arg "--dangerously-skip-permissions"

Each passthrough flag consumes exactly one following token, even if that token begins with - or --. The --worker-arg=value form also works. Repeat the flag to pass multiple arguments.
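The one-token rule matters because a value such as --dangerously-skip-permissions looks like a flag itself. A minimal sketch of that parsing behavior follows; `split_passthrough` is a hypothetical helper written for illustration, not catchball's parser.

```python
PASSTHROUGH_FLAGS = ("--worker-arg", "--fixer-arg", "--reviewer-arg")


def split_passthrough(argv: list[str]) -> tuple[dict[str, list[str]], list[str]]:
    """Separate passthrough values from the rest of the command line."""
    collected: dict[str, list[str]] = {flag: [] for flag in PASSTHROUGH_FLAGS}
    remaining: list[str] = []
    i = 0
    while i < len(argv):
        token = argv[i]
        flag, sep, value = token.partition("=")
        if sep and flag in PASSTHROUGH_FLAGS:
            # --worker-arg=value form: the value is attached to the flag.
            collected[flag].append(value)
            i += 1
        elif token in PASSTHROUGH_FLAGS:
            if i + 1 >= len(argv):
                raise ValueError(f"{token} expects exactly one value")
            # Take the next token as-is, even if it starts with - or --.
            collected[token].append(argv[i + 1])
            i += 2
        else:
            remaining.append(token)
            i += 1
    return collected, remaining
```

Repeating a flag simply appends another entry to its list, which matches the "repeat the flag to pass multiple arguments" rule.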

Can I add a delay between phases?

Yes. --phase-delay defaults to 3, and the same delay is applied between completed tasks. Set it to 0 to disable the pause entirely:

catchball --worker claude --reviewer codex --phase-delay 0

Each completed task also emits a timing summary line with total time plus worker, fixer, reviewer, and delay totals.

Can I run tasks in parallel?

catchball is strictly sequential inside one task list. If some tasks are independent, split them into separate folders and run in separate terminals.

What does --allow-dirty-worktree do?

catchball wants a clean git worktree before starting. This flag skips that check.

What are these lock files?

They live under the shared catchball-runs/state/ tree and prevent two runs from working the same task. Stale locks are cleared automatically.
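For readers unfamiliar with lock files, here is a generic sketch of the technique: the lock is taken by creating a file exclusively, and a lock older than some age is treated as stale. The README does not say how catchball detects stale locks, so the age-based rule, the `try_acquire` name, and the one-hour default are all assumptions.

```python
import os
import time
from pathlib import Path


def try_acquire(lock_path: Path, stale_after_seconds: float = 3600.0) -> bool:
    """Return True if this run now holds the lock, False if another run does."""
    try:
        age = time.time() - lock_path.stat().st_mtime
        if age > stale_after_seconds:
            lock_path.unlink(missing_ok=True)  # hypothetical staleness rule
    except FileNotFoundError:
        pass  # no lock yet
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True
```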

Where do the logs go?

Under <active-root>/catchball-runs/<timestamp>/ — each run gets a log file, worker output, reviewer output, reviews, and fixer responses. Use --state-dir to override.

Run directories use an unambiguous UTC timestamp format.

Is this another Ralph Wiggum?

Same spirit, more structure. catchball is a role-based multi-agent coding loop with fixed implement, review, and fix stages.

