polyflow

A skill that teaches Claude, Codex, Copilot, and Gemini to author workflows: deterministic multi-agent orchestration scripts under plain JavaScript control flow.

Design principle: topology before coding. Pick the workflow shape first (fan-out, pipeline, or loop), capture it in a workflow spec, then write the workflow file.

A workflow is a JavaScript file. The loops, the conditionals, the fan-out are ordinary code that you control. Only the leaf agent() calls spend model tokens, and each one runs in its own clean context window. The result is multi-agent work that behaves the same way every run and can be resumed if it stops partway.
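
As a minimal sketch of that split, the snippet below fans out three review tasks with ordinary JavaScript and merges the results in code. The `agent` function here is a local stand-in defined only so the sketch runs on its own; the real global and its options are described in references/api-reference.md and may differ from what is assumed here.

```javascript
// Stand-in for the runtime's agent() global (assumption: the real one is
// async and resolves to the agent's result). This stub just echoes the prompt.
async function agent(prompt) {
  return { prompt, findings: [`finding for: ${prompt}`] };
}

// Ordinary control flow decides what runs; only agent() would spend tokens.
async function reviewBranch(areas) {
  // Fan-out: one isolated agent call per area, started concurrently.
  const results = await Promise.all(
    areas.map((area) => agent(`Review the branch for ${area} issues`))
  );
  // Barrier reached: every call has finished, so merge in plain code.
  return results.flatMap((r) => r.findings);
}
```

Calling `reviewBranch(["bugs", "security", "tests"])` issues three agent calls and returns their findings in input order, because `Promise.all` preserves the order of the array it is given.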

This skill carries the file format, the judgement calls, and a tested authoring procedure, so you can just ask Claude to "create a workflow for X" and get a correct, runnable file back.

New in 0.3.0

Project-local runtime/plugin implementation for the portable cross-platform spec.
Strict runtime contract validation for project-only, file-backed, no-shared-state behavior.
Eval benchmarking that records quality, determinism, and replay/resume gains against the baseline path.
CI coverage for tooling validation, portable spec validation, and runtime plugin smoke tests.

Path	What it is
`SKILL.md`	Skill entry point: the procedure Claude follows to design and write a workflow
`references/api-reference.md`	Complete manual: every global, every option, every cap and constant
`references/patterns.md`	Copy-paste orchestration patterns (fan-out, pipeline, loop-until-budget, judge panel, and more)
`assets/templates/`	Starter files for the three core shapes: fan-out, pipeline, loop
`assets/examples/`	Seven complete runnable example workflows
`scripts/validate-workflow.mjs`	Linter: checks a workflow file against the parser's hard rules before you run it
`scripts/validate-workflow-spec.mjs`	Spec validator: checks required workflow-spec sections and sign-off completeness
`scripts/estimate-cost.mjs`	Static estimator: projects agent count, fan-out/loop shape, and rough run cost
`scripts/scaffold-evals.mjs`	Generates eval scaffolding from `evals/evals.json`
`examples/cross-platform/`	Portable canonical spec and per-platform adapter notes (Claude / Codex / Copilot / Gemini / Kilo / Gumloop / OpenCode)
`evals/evals.json`	Starter evaluation test cases

Install

git clone https://github.com/hiranp/polyflow.git
mkdir -p ~/.claude/skills
cp -R polyflow ~/.claude/skills/polyflow

The next time Claude Code starts, the skill is available.

Quick start

1 -- Enable the Workflow tool

The tool is off by default and requires an environment variable:

# per session
export CLAUDE_CODE_WORKFLOWS=1
claude

Or set it permanently in .claude/settings.local.json:

{ "env": { "CLAUDE_CODE_WORKFLOWS": "1" } }

2 -- Ask an AI to build a workflow

"Create a workflow that reviews my branch across bugs, security, and tests, then verifies each finding."

Claude (or any supported AI) uses this skill to design, write, and validate the file, then runs it. Watch live progress with /workflows.

3 -- Create and validate the workflow spec first

Use assets/templates/workflow-spec.template.md to decide the topology, barriers, schemas, and verification steps before writing any JavaScript.

node scripts/validate-workflow-spec.mjs <path-to-WORKFLOW-SPEC.md>

Fix every reported issue before writing the workflow file.

4 -- Lint before running

node ~/.claude/skills/polyflow/scripts/validate-workflow.mjs <path-to-file.js>

Exit 0 means the file is clean. Fix any reported errors before invoking the workflow.

5 -- Estimate cost before long runs

node ~/.claude/skills/polyflow/scripts/estimate-cost.mjs <path-to-file.js>

This gives a static estimate of the model mix, the fan-out/loop amplification, and a rough per-run cost range.
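
To see why fan-out and loops dominate cost, the arithmetic can be sketched by hand. This is a back-of-envelope illustration only, not the algorithm used by estimate-cost.mjs; the function name, its fields, and the price input are all hypothetical.

```javascript
// Back-of-envelope estimate. Every field name here is hypothetical, and the
// price is an input supplied by the caller, not a real rate.
function estimateRun({ fanOutWidth, loopIterations, tokensPerCall, usdPerMillionTokens }) {
  // Fan-out and loops multiply: each iteration launches the full fan-out.
  const agentCalls = fanOutWidth * loopIterations;
  const totalTokens = agentCalls * tokensPerCall;
  const usd = (totalTokens * usdPerMillionTokens) / 1_000_000;
  return { agentCalls, totalTokens, usd };
}
```

A fan-out of 3 inside a loop of 4 iterations is already 12 agent calls, so capping either dimension has a direct effect on the bill.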

How it works

You --> AI (with polyflow skill)
              |
              +-- designs the topology (fan-out / pipeline / loop)
              +-- writes a .js workflow file
              +-- calls Workflow({ scriptPath })
                        |
                        +-- agent("task A")  <- fresh context, own token budget
                        +-- agent("task B")  <- fresh context, own token budget
                        +-- agent("task C")  <- fresh context, own token budget

Each agent() call runs in isolation -- no shared state, no context bleed. The orchestration logic (loops, conditions, fan-out) is plain JavaScript you can read and audit.
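
Because nothing carries over between calls, a later stage has to be handed its inputs explicitly. The pipeline below is a small sketch of that idea: one call lists findings, then each finding is verified in its own call. The `agent` stub and its canned return values are assumptions made so the example is self-contained; they do not reflect the real runtime's return shape.

```javascript
// Stand-in for the runtime's agent() global (assumption). Each call sees only
// the prompt it is given, which mimics a fresh context.
async function agent(prompt) {
  if (prompt.startsWith("List")) return ["null check missing", "unused import"];
  return { verified: prompt.includes("null check") };
}

async function findThenVerify() {
  // Stage 1: one call produces the findings.
  const findings = await agent("List suspected bugs in the diff");
  // Stage 2: each finding is verified in its own call. The finding text is
  // passed in explicitly because no state is shared between calls.
  const checked = await Promise.all(
    findings.map(async (finding) => {
      const verdict = await agent(`Verify this finding: ${finding}`);
      return { finding, verified: verdict.verified };
    })
  );
  // Plain JavaScript decides which findings survive.
  return checked.filter((c) => c.verified).map((c) => c.finding);
}
```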

Cross-platform support

Polyflow workflows are portable. The examples/cross-platform/ directory contains:

A canonical portable spec (portable-skill-spec.json)
A concrete project-local runtime/plugin implementation (runtime-file-backed-plugin.mjs)
Per-platform runner guides for Claude Code, Codex, Copilot, Gemini, Kilo Code, Gumloop, and OpenCode
An adapter conformance checklist

The current architectural decision is to merge the runtime/plugin track now and carry broader memory/compression work as follow-up issues. See docs/RUNTIME-PLUGIN-DECISION.md.

Platform guidance for OpenCode, Copilot, and Gemini

The most useful early lesson is not "add hidden memory everywhere". It is to make context management intentional: keep one project thread of work alive, externalize state to stable artifacts, and let one layer own context reduction.

For polyflow, that translates into these portable rules:

Keep one top-level session per project when possible, but run leaf work units from explicit artifacts rather than from ambient chat state.
Pin a small set of orientation files up front: the workflow spec, the portable spec, and any architecture notes that define constraints.
Use a read-only recall step before expensive work and a summarize/persist step after the run. Keep both project-local and file-backed.
If a harness or plugin already manages compaction, do not stack a second compaction system on top of it unless responsibilities are clearly separated.
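
The recall and persist steps in these rules can be as small as two file helpers. The following is a minimal sketch, assuming a single `summary.json` per run directory; the helper names and the file layout are hypothetical and are not taken from runtime-file-backed-plugin.mjs.

```javascript
// Hypothetical helpers; the summary.json layout under runDir is an assumption.
// Dynamic import keeps the sketch loadable as CommonJS or as an ES module.

// Read-only recall: returns the last persisted summary, or null on a first run.
async function recall(runDir) {
  const { readFile } = await import("node:fs/promises");
  const { join } = await import("node:path");
  try {
    return JSON.parse(await readFile(join(runDir, "summary.json"), "utf8"));
  } catch (err) {
    if (err.code === "ENOENT") return null; // nothing persisted yet
    throw err; // corrupt or unreadable state should not be silently ignored
  }
}

// Persist after the run: writes the summary to a stable, project-local file.
async function persist(runDir, summary) {
  const { mkdir, writeFile } = await import("node:fs/promises");
  const { join } = await import("node:path");
  await mkdir(runDir, { recursive: true });
  await writeFile(join(runDir, "summary.json"), JSON.stringify(summary, null, 2));
}
```

Keeping both helpers file-backed means a resumed run starts from the same artifact a human can open and inspect.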

OpenCode

Use examples/cross-platform/runner-opencode-runbook.md as the harness-specific starting point.
Keep the outer OpenCode session attached to one project so artifact paths and architectural context stay stable.
If you use an external context manager, disable overlapping built-in compaction so cached prefixes and deferred reductions do not fight each other.
Favor project-root files such as .polyflow/runs/ and .planning/ over session-only notes when you need replay, resume, or auditability.

GitHub Copilot

Use examples/cross-platform/runner-copilot-runbook.md as the base runbook.
Treat each review or verify unit as a fresh task with explicit JSON inputs and outputs.
Keep prompts short and schema-first; use one final synthesis pass instead of repeatedly restating the whole plan.
Re-read pinned files at stage boundaries instead of relying on long conversational carryover.

Gemini

Use examples/cross-platform/runner-gemini-runbook.md as the base runbook.
Prefer one clean session per unit of work and aggregate through files between stages.
Validate schema after every stage, retry once on mismatch, and checkpoint outputs before starting the next pass.
Keep stop conditions explicit (maxUnits, maxTokens, maxDuration) so long-running loops fail predictably.
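
The Gemini rules above can be sketched as one bounded loop. This is an illustration under stated assumptions rather than the runbook's implementation: `runBounded`, `runUnit`, and `isValid` are hypothetical names, `runUnit` is assumed to resolve to `{ output, tokens }`, and `maxDuration` is assumed to be in milliseconds.

```javascript
// Runs work units until one of the explicit limits is hit.
// runUnit: caller-supplied async function (hypothetical) -> { output, tokens }.
// isValid: caller-supplied schema check on the output.
async function runBounded(units, runUnit, isValid, limits) {
  const { maxUnits, maxTokens, maxDuration } = limits; // maxDuration in ms (assumption)
  const started = Date.now();
  const outputs = [];
  let tokens = 0;

  for (const unit of units) {
    // Check every stop condition before spending anything on the next unit.
    if (outputs.length >= maxUnits) return { outputs, stopped: "maxUnits" };
    if (tokens >= maxTokens) return { outputs, stopped: "maxTokens" };
    if (Date.now() - started >= maxDuration) return { outputs, stopped: "maxDuration" };

    let result = await runUnit(unit);
    tokens += result.tokens;
    if (!isValid(result.output)) {
      // Retry once on schema mismatch, then fail loudly.
      result = await runUnit(unit);
      tokens += result.tokens;
      if (!isValid(result.output)) {
        throw new Error(`schema mismatch after retry for unit: ${JSON.stringify(unit)}`);
      }
    }
    outputs.push(result.output); // a real run would checkpoint this to a file
  }
  return { outputs, stopped: "done" };
}
```

Returning the reason the loop stopped makes a budget cut-off distinguishable from normal completion in the run log.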

Example workflows

Seven production-quality examples live in assets/examples/:

Workflow	Pattern
`review-branch.js`	pipeline + nested parallel
`implement-and-review.js`	do/while loop with schema-driven exit
`triage-sentry.js`	list to pipeline with MCP tool call
`dead-code-sweep.js`	loop-until-dry with dry-streak counter
`api-contract-drift-detector.js`	fan-out with deliberate barrier
`customer-feedback-theme-extractor.js`	parallel to barrier to cluster
`software-dev-pipeline.js`	spec-intake to recall to execute/verify with scoped memory persist

Requirements

Claude Code 2.1.149+ (or compatible Codex / Copilot / Gemini runtime)
Node.js 18+ (for validation, estimation, and eval scaffold scripts)

Contributing

Pull requests are welcome. See CONTRIBUTING.md for guidelines.

License

MIT

Credits

Inspired by Claude's workflow capabilities and Ray Amjad. Built with community feedback, testing, and ideas.
