polyflow

A skill that teaches Claude, Codex, Copilot, and Gemini to author workflows: deterministic multi-agent orchestration scripts under plain JavaScript control flow.


Design principle: topology before coding. Pick the workflow shape first (fan-out, pipeline, or loop), capture it in a workflow spec, then write the workflow file.

A workflow is a JavaScript file. The loops, the conditionals, the fan-out are ordinary code that you control. Only the leaf agent() calls spend model tokens, and each one runs in its own clean context window. The result is multi-agent work that behaves the same way every run and can be resumed if it stops partway.
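
As a rough illustration of that split, the sketch below keeps a bounded review loop in ordinary JavaScript and pushes only the leaf work into agent(). The agent() defined here is a local stub invented for this example so the snippet runs on its own, and implementAndReview and maxRounds are hypothetical names; the real global and its options are described in references/api-reference.md and may differ.

```javascript
// Hypothetical stand-in for the runtime's agent() global, so this sketch is
// self-contained. It "approves" a draft once it has been revised twice.
async function agent(prompt) {
  if (prompt.startsWith("review:")) {
    const revisions = (prompt.match(/\[rev\]/g) || []).length;
    return { approved: revisions >= 2 };
  }
  return prompt.replace(/^revise:/, "") + " [rev]";
}

// Plain control flow: a bounded loop with a conditional exit.
// Only the two agent() calls would spend model tokens in a real run.
async function implementAndReview(draft, maxRounds = 5) {
  let rounds = 0;
  while (rounds < maxRounds) {
    const verdict = await agent("review:" + draft); // leaf call
    if (verdict.approved) break; // ordinary conditional
    draft = await agent("revise:" + draft); // leaf call
    rounds += 1;
  }
  return { draft, rounds };
}
```

Because the loop bound and the exit condition live in code rather than in a prompt, the number of rounds is something you can read off the file and cap explicitly.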

This skill carries the file format, the judgement calls, and a tested authoring procedure, so you can just ask Claude to "create a workflow for X" and get a correct, runnable file back.

New in 0.3.0

  • Project-local runtime/plugin implementation for the portable cross-platform spec.
  • Strict runtime contract validation for project-only, file-backed, and no-shared-state behavior.
  • Eval benchmarking that records quality, determinism, and replay/resume gains against the baseline path.
  • CI coverage for tooling validation, portable spec validation, and runtime plugin smoke tests.

Contents

| Path | What it is |
| --- | --- |
| SKILL.md | Skill entry point: the procedure Claude follows to design and write a workflow |
| references/api-reference.md | Complete manual: every global, every option, every cap and constant |
| references/patterns.md | Copy-paste orchestration patterns (fan-out, pipeline, loop-until-budget, judge panel, and more) |
| assets/templates/ | Starter files for the three core shapes: fan-out, pipeline, loop |
| assets/examples/ | Seven complete runnable example workflows |
| scripts/validate-workflow.mjs | Linter: checks a workflow file against the parser's hard rules before you run it |
| scripts/validate-workflow-spec.mjs | Spec validator: checks required workflow-spec sections and sign-off completeness |
| scripts/estimate-cost.mjs | Static estimator: projects agent count, fan-out/loop shape, and rough run cost |
| scripts/scaffold-evals.mjs | Generates eval scaffolding from evals/evals.json |
| examples/cross-platform/ | Portable canonical spec and per-platform adapter notes (Claude / Codex / Copilot / Gemini / Kilo / Gumloop / OpenCode) |
| evals/evals.json | Starter evaluation test cases |

Install

git clone https://github.com/hiranp/polyflow.git
mkdir -p ~/.claude/skills
cp -R polyflow ~/.claude/skills/polyflow

The next time Claude Code starts, the skill is available.

Quick start

1 -- Enable the Workflow tool

The tool is off by default and requires an environment variable:

# per session
export CLAUDE_CODE_WORKFLOWS=1
claude

Or set it permanently in .claude/settings.local.json:

{ "env": { "CLAUDE_CODE_WORKFLOWS": "1" } }

2 -- Ask an AI to build a workflow

"Create a workflow that reviews my branch across bugs, security, and tests, then verifies each finding."

Claude (or any supported AI) uses this skill to design, write, and validate the file, then runs it. Watch live progress with /workflows.

3 -- Create and validate the workflow spec first

Use assets/templates/workflow-spec.template.md to settle the topology, barriers, schemas, and verification steps before writing any JavaScript.

node ~/.claude/skills/polyflow/scripts/validate-workflow-spec.mjs <path-to-WORKFLOW-SPEC.md>

Fix every reported issue before writing the workflow file.

4 -- Lint before running

node ~/.claude/skills/polyflow/scripts/validate-workflow.mjs <path-to-file.js>

Exit 0 means the file is clean. Fix any reported errors before invoking the workflow.

5 -- Estimate cost before long runs

node ~/.claude/skills/polyflow/scripts/estimate-cost.mjs <path-to-file.js>

This produces a static estimate of the model mix, the fan-out/loop amplification, and a rough per-run cost range.

How it works

You --> AI (with polyflow skill)
              |
              +-- designs the topology (fan-out / pipeline / loop)
              +-- writes a .js workflow file
              +-- calls Workflow({ scriptPath })
                        |
                        +-- agent("task A")  <- fresh context, own token budget
                        +-- agent("task B")  <- fresh context, own token budget
                        +-- agent("task C")  <- fresh context, own token budget

Each agent() call runs in isolation -- no shared state, no context bleed. The orchestration logic (loops, conditions, fan-out) is plain JavaScript you can read and audit.
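
The fan-out shape in the diagram can be sketched roughly as follows. This is an illustration under assumptions, not the skill's actual output: agent() is stubbed locally so the snippet runs by itself, and the return shape ({ task, findings }) and the fanOut helper are invented for the example.

```javascript
// Hypothetical stub for the runtime's agent() global. Each call receives only
// its own prompt, mirroring the "fresh context" behaviour described above.
async function agent(prompt) {
  return { task: prompt, findings: [`${prompt}: ok`] };
}

// Fan-out: start one leaf call per task, then wait for all of them (barrier).
async function fanOut(tasks) {
  const results = await Promise.all(tasks.map((task) => agent(task)));
  // Aggregation is ordinary code; no model tokens are spent here.
  return results.flatMap((r) => r.findings);
}
```

Promise.all resolves in input order, so the aggregated findings line up with the task list regardless of which leaf call finishes first.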

Cross-platform support

Polyflow workflows are portable. The examples/cross-platform/ directory contains:

  • A canonical portable spec (portable-skill-spec.json)
  • A concrete project-local runtime/plugin implementation (runtime-file-backed-plugin.mjs)
  • Per-platform runner guides for Claude Code, Codex, Copilot, Gemini, Kilo Code, Gumloop, and OpenCode
  • An adapter conformance checklist

The current architecture decision is to merge the runtime/plugin track now and carry broader memory/compression work as follow-up issues. See docs/RUNTIME-PLUGIN-DECISION.md.

Platform guidance for OpenCode, Copilot, and Gemini

The most useful early lesson is not "add hidden memory everywhere". It is to make context management intentional: keep one project thread of work alive, externalize state to stable artifacts, and let one layer own context reduction.

For polyflow, that translates into these portable rules:

  • Keep one top-level session per project when possible, but run leaf work units from explicit artifacts rather than from ambient chat state.
  • Pin a small set of orientation files up front: the workflow spec, the portable spec, and any architecture notes that define constraints.
  • Use a read-only recall step before expensive work and a summarize/persist step after the run. Keep both project-local and file-backed.
  • If a harness or plugin already manages compaction, do not stack a second compaction system on top of it unless responsibilities are clearly separated.
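
One way to picture the recall and persist steps from the rules above is a pair of small file helpers. The sketch below is an assumption-laden illustration: the one-JSON-file-per-run layout, the recall and persist function names, and the summary fields are all hypothetical, and only the .polyflow/runs/ directory name is taken from this README. It is written as CommonJS so it runs with plain node.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical layout: one JSON summary per run under <project>/.polyflow/runs/.
function runsDir(projectRoot) {
  return path.join(projectRoot, ".polyflow", "runs");
}

// Read-only recall: load earlier summaries without modifying anything.
function recall(projectRoot) {
  const dir = runsDir(projectRoot);
  if (!fs.existsSync(dir)) return [];
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".json"))
    .sort()
    .map((name) => JSON.parse(fs.readFileSync(path.join(dir, name), "utf8")));
}

// Persist: write this run's summary as a new file once the work is done.
function persist(projectRoot, runId, summary) {
  const dir = runsDir(projectRoot);
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${runId}.json`);
  fs.writeFileSync(file, JSON.stringify({ runId, ...summary }, null, 2));
  return file;
}
```

Keeping recall strictly read-only and giving each run its own file means a replay or resume can inspect earlier state without any risk of overwriting it.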

OpenCode

  • Use examples/cross-platform/runner-opencode-runbook.md as the harness-specific starting point.
  • Keep the outer OpenCode session attached to one project so artifact paths and architectural context stay stable.
  • If you use an external context manager, disable overlapping built-in compaction so cached prefixes and deferred reductions do not fight each other.
  • Favor project-root files such as .polyflow/runs/ and .planning/ over session-only notes when you need replay, resume, or auditability.

GitHub Copilot

  • Use examples/cross-platform/runner-copilot-runbook.md as the base runbook.
  • Treat each review or verify unit as a fresh task with explicit JSON inputs and outputs.
  • Keep prompts short and schema-first; use one final synthesis pass instead of repeatedly restating the whole plan.
  • Re-read pinned files at stage boundaries instead of relying on long conversational carryover.

Gemini

  • Use examples/cross-platform/runner-gemini-runbook.md as the base runbook.
  • Prefer one clean session per unit of work and aggregate through files between stages.
  • Validate schema after every stage, retry once on mismatch, and checkpoint outputs before starting the next pass.
  • Keep stop conditions explicit (maxUnits, maxTokens, maxDuration) so long-running loops fail predictably.
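
The last two rules can be combined into one loop, sketched below. Only the names maxUnits, maxTokens, and maxDuration come from this guide; treating maxDuration as milliseconds, the runUnit stub, the isValid shape check, and the in-memory checkpoint array are assumptions made so the example is self-contained. A real run would call the Gemini runner and write checkpoints to files.

```javascript
// Hypothetical leaf call: returns an output plus the tokens it consumed.
// A unit marked flaky returns a malformed result on its first attempt.
async function runUnit(unit, attempt) {
  const malformed = unit.flaky === true && attempt === 0;
  return {
    tokens: 100,
    output: malformed ? { oops: true } : { id: unit.id, status: "done" },
  };
}

// Minimal shape check standing in for real schema validation.
function isValid(output) {
  return typeof output.id === "string" && typeof output.status === "string";
}

async function runWithLimits(units, { maxUnits, maxTokens, maxDuration }) {
  const started = Date.now();
  const checkpoints = []; // in-memory stand-in for file-backed checkpoints
  let tokens = 0;

  for (const unit of units) {
    // Explicit stop conditions, checked before every unit.
    if (checkpoints.length >= maxUnits) return { stopped: "maxUnits", checkpoints };
    if (tokens >= maxTokens) return { stopped: "maxTokens", checkpoints };
    if (Date.now() - started >= maxDuration) return { stopped: "maxDuration", checkpoints };

    let result = await runUnit(unit, 0);
    tokens += result.tokens;
    if (!isValid(result.output)) {
      result = await runUnit(unit, 1); // retry once on mismatch
      tokens += result.tokens;
      if (!isValid(result.output)) return { stopped: "schemaMismatch", checkpoints };
    }
    checkpoints.push(result.output); // checkpoint before the next unit starts
  }
  return { stopped: null, checkpoints };
}
```

Returning a named stop reason alongside the checkpoints collected so far makes it clear why a run ended and where a resume should pick up.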

Example workflows

Seven production-quality examples live in assets/examples/:

| Workflow | Pattern |
| --- | --- |
| review-branch.js | pipeline + nested parallel |
| implement-and-review.js | do/while loop with schema-driven exit |
| triage-sentry.js | list to pipeline with MCP tool call |
| dead-code-sweep.js | loop-until-dry with dry-streak counter |
| api-contract-drift-detector.js | fan-out with deliberate barrier |
| customer-feedback-theme-extractor.js | parallel to barrier to cluster |
| software-dev-pipeline.js | spec-intake to recall to execute/verify with scoped memory persist |

Requirements

  • Claude Code 2.1.149+ (or compatible Codex / Copilot / Gemini runtime)
  • Node.js 18+ (for validation, estimation, and eval scaffold scripts)

Contributing

Pull requests are welcome. See CONTRIBUTING.md for guidelines.

License

MIT

Credits

Inspired by Claude's workflow capabilities and Ray Amjad. Built with community feedback, testing, and ideas.
