feat: add WhaleFlow — declarative multi-agent workflow orchestration#2482
AdityaVG13 wants to merge 5 commits into
New crate crates/whaleflow providing declarative JSON-config-driven sub-agent swarm orchestration for CodeWhale. Inspired by Claude Code's Dynamic Workflows (Opus 4.8, May 2026).

- WorkflowConfig JSON schema with phases, tasks, dependencies
- Topological scheduler with semaphore-based concurrency control
- File-scope conflict detection for parallel write safety
- Git worktree isolation per task (create → extract → apply → clean)
- Structured WorkflowResult with per-task cost/token tracking
- workflow_run tool schema for model invocation
- TUI integration via WhaleFlowSpawner (SubAgentManager bridge)
- 18 tests: 15 unit + 3 integration
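The scheduler source is not shown on this page, so the following is only a minimal sketch of what "topological scheduling" of dependent tasks can look like. It assumes a hypothetical map from task id to the ids it depends on, and groups tasks into waves where every task depends only on earlier waves. The name `schedule_waves` and the input shape are illustrative assumptions, not the PR's API, and semaphore-based concurrency limits are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper (not from the PR): groups task ids into waves so that
/// each task only depends on tasks placed in earlier waves.
fn schedule_waves<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
) -> Result<Vec<Vec<String>>, String> {
    // Copy the dependency lists into sets of still-unmet dependencies,
    // rejecting references to tasks that do not exist.
    let mut remaining: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for (task, task_deps) in deps {
        for dep in task_deps {
            if !deps.contains_key(dep) {
                return Err(format!("task `{task}` depends on unknown task `{dep}`"));
            }
        }
        remaining.insert(*task, task_deps.iter().copied().collect());
    }

    let mut waves: Vec<Vec<String>> = Vec::new();
    while !remaining.is_empty() {
        // Every task with no unmet dependency can run in this wave.
        let ready: Vec<&'a str> = remaining
            .iter()
            .filter(|(_, unmet)| unmet.is_empty())
            .map(|(task, _)| *task)
            .collect();
        if ready.is_empty() {
            // Tasks remain but none is runnable: the graph has a cycle.
            return Err("dependency cycle detected".to_string());
        }
        for task in &ready {
            remaining.remove(task);
        }
        // Mark this wave's tasks as satisfied for everything still waiting.
        for unmet in remaining.values_mut() {
            for task in &ready {
                unmet.remove(task);
            }
        }
        waves.push(ready.iter().map(|task| task.to_string()).collect());
    }
    Ok(waves)
}
```

Tasks inside one wave are independent of each other, which is the point at which a real scheduler could fan out under a concurrency limit.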
…_blocking

- cwd_path() now returns worktree path for Worktree variant (was dead code)
- parallel phases now honor Abort failure policy
- WorktreeManager git calls wrapped in tokio::spawn_blocking
- timeout_secs wired end-to-end with tokio::time::timeout on polling loop
- AgentSpawner trait extended with timeout_secs/max_steps parameters
- WorkflowRunTool no longer claims ReadOnly capability
- unknown agent_type now logs a warning instead of silently defaulting

Addresses Greptile review: P1 (blocking Command), P2 (dead timeout_secs)
pub fn extract_changes(task_id: &str, workspace: &Path) -> Result<String, SpawnError> {
    let relative = format!(".worktrees/whaleflow-{}", task_id);
    let worktree_path = workspace.join(&relative);

    let output = Command::new("git")
        .arg("-C")
        .arg(&worktree_path)
        .arg("diff")
        .arg("HEAD")
        .output()
        .map_err(|e| {
            SpawnError::WorktreeError(format!("git diff in worktree failed: {}", e))
        })?;

    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(SpawnError::WorktreeError(format!(
            "git diff in worktree failed: {}",
            stderr.trim()
        )));
    }

    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
git diff HEAD misses committed changes — data loss for committing sub-agents
extract_changes runs git diff HEAD inside the worktree, which compares the working directory against the worktree's current HEAD. If a sub-agent commits any of its work (advancing HEAD in the worktree), git diff HEAD only captures uncommitted changes after the last commit — all committed changes between the worktree's initial HEAD and its final HEAD are excluded from the patch. When the worktree is then removed, those committed changes are permanently lost.
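The fix is to record the initial HEAD SHA when create is called, then run git diff <initial_sha> in extract_changes. That form compares the working tree against the initial commit, so a single patch captures both committed and uncommitted changes to tracked files. git diff <initial_sha>..HEAD is not sufficient on its own: it compares only the two commits and omits uncommitted work. Untracked files appear in neither form unless the sub-agent stages or commits them first.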
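A minimal sketch of that fix, assuming the base SHA is captured right after the worktree is created and passed back in at extraction time. The helper names (`run_git`, `record_base_sha`, `extract_changes_since`, `diff_args`) and the plain `String` error type are illustrative assumptions; the PR's own code uses `SpawnError` and `git -C <path>` rather than `current_dir`.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: run git inside `worktree` and return raw stdout.
fn run_git(worktree: &Path, args: &[&str]) -> Result<String, String> {
    let output = Command::new("git")
        .current_dir(worktree)
        .args(args)
        .output()
        .map_err(|e| format!("failed to run git: {e}"))?;
    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).trim().to_string());
    }
    // Stdout is returned untrimmed so a patch keeps its trailing newline.
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Arguments for diffing the working tree against a recorded base commit.
/// Covers commits made after `base_sha` plus uncommitted tracked changes;
/// untracked files are not included.
fn diff_args(base_sha: &str) -> [&str; 2] {
    ["diff", base_sha]
}

/// Call once, right after the worktree is created.
fn record_base_sha(worktree: &Path) -> Result<String, String> {
    run_git(worktree, &["rev-parse", "HEAD"]).map(|sha| sha.trim().to_string())
}

/// Call when the sub-agent finishes, with the SHA recorded at creation.
fn extract_changes_since(worktree: &Path, base_sha: &str) -> Result<String, String> {
    run_git(worktree, &diff_args(base_sha))
}
```

Storing the SHA alongside the worktree path keeps extraction independent of whatever the sub-agent did to HEAD in the meantime.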
… tests

- max_steps flows through SubAgentSpawnOptions to per-agent step budget
- extract_changes errors now logged instead of silently ignored
- files_touched populated from worktree diff output
- TaskStatus re-exported from whaleflow crate
- 3 new tests: abort in parallel phase, abort stops subsequent phases, timeout_secs/max_steps deserialization

Addresses Greptile P1 (silent failure on extract_changes)
    let handle = tokio::spawn(async move {
        let _permit = sem.acquire().await;
        spawner
            .spawn(task_id_for_closure, prompt, task.agent_type.clone(), cwd, timeout_secs, max_steps)
            .await
    });
    handles.push((task_id, handle));
}
Spawned tasks are detached, not cancelled, on Abort in parallel phase
When a task fails and FailurePolicy::Abort triggers inside the for (task_id, handle) in handles loop, the loop breaks and the remaining JoinHandles in handles are dropped. Dropping a tokio::spawn JoinHandle in Tokio detaches the task — the underlying async task continues running to completion. For Shared-isolation ReadWrite tasks this means additional sub-agents keep writing to the main workspace after the scheduler has already reported the workflow as Aborted, producing changes the orchestrator never learns about and that can conflict with subsequent workflow runs. The integration test passes because the MockSpawner returns instantly, so the detached task finishes before the test asserts anything, masking the real-world behavior.
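In Tokio the direct remedies would be calling `JoinHandle::abort` on the remaining handles, or holding the tasks in a `JoinSet`, which aborts its tasks when dropped. The sketch below illustrates the underlying idea with the standard library only, so it swaps in a different technique: threads plus a shared cancellation flag instead of Tokio tasks. The important difference from the code above is that the loop never breaks and drops handles; it signals cancellation and keeps joining, so every task is accounted for before the phase reports back. All names here (`Outcome`, `run_task`, `run_phase_with_abort`) are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical per-task outcome, not a type from the PR.
#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    Failed,
    Cancelled,
}

/// Simulated task: does `steps` units of work and checks the shared
/// cancellation flag before each unit.
fn run_task(steps: u32, fail: bool, cancel: &AtomicBool) -> Outcome {
    for _ in 0..steps {
        if cancel.load(Ordering::SeqCst) {
            return Outcome::Cancelled;
        }
        thread::sleep(Duration::from_millis(5));
    }
    if fail {
        Outcome::Failed
    } else {
        Outcome::Completed
    }
}

/// Runs all tasks in parallel. On the first failure the flag is raised, and
/// the remaining handles are still joined instead of being dropped.
fn run_phase_with_abort(tasks: Vec<(u32, bool)>) -> Vec<Outcome> {
    let cancel = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = tasks
        .into_iter()
        .map(|(steps, fail)| {
            let cancel = Arc::clone(&cancel);
            thread::spawn(move || run_task(steps, fail, &cancel))
        })
        .collect();

    let mut outcomes = Vec::new();
    for handle in handles {
        let outcome = handle.join().expect("task thread panicked");
        if outcome == Outcome::Failed {
            // Abort policy: tell every other task to stop, then keep joining.
            cancel.store(true, Ordering::SeqCst);
        }
        outcomes.push(outcome);
    }
    outcomes
}
```

A test built on a mock that returns instantly would not distinguish this from the detaching version, so the simulated tasks here deliberately take many steps.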
Cool feature! I can't wait to play with it.
Thanks @AdityaVG13. I did a fresh v0.9 stewardship pass on this and #2486. This is the right WhaleFlow direction and I want to preserve it as the source branch for the workflow runner work, but I am not going to merge or directly harvest the full PR into the current v0.9 integration branch as-is. Current release evidence:
Safe path from here: keep this PR open as the intent/source branch, then land WhaleFlow in smaller maintainer slices against the v0.9 branch. The first viable slice should be something like typed config/IR validation plus deterministic scheduler tests, behind a feature/config gate and without exposing a write-capable

Thanks also @mo-vic for the product interest here. The feature is exciting; the restraint is only because workflow orchestration is exactly the kind of surface where a partial merge can write to the wrong place or leave agent work running after the UI says it stopped.
Adds @AdityaVG13 to the contribution-gate allowlist now that WhaleFlow #2482/#2486 have been harvested into the maintained v0.9 IR/TraceStore foundation with public credit.
Thanks @AdityaVG13. As part of v0.9 stewardship, I opened #2821 as a narrow maintainer harvest of the safe typed-IR naming surface from this WhaleFlow direction.

What was harvested: explicit

What remains intentionally out of scope for that maintainer PR: runtime
Add the explicit WorkflowSpec/WorkflowNode metadata surface requested for the v0.9 WhaleFlow IR, including budget, permission, model, and promotion policy records plus serde roundtrip coverage. Runtime execution, replay, and worktree application remain out of scope. Refs #2668, #2482, #2486. Co-authored-by: AdityaVG13 <44177453+AdityaVG13@users.noreply.github.com>
Thanks again @AdityaVG13. I opened #2823 as another narrow v0.9 maintainer harvest from this WhaleFlow direction.

What #2823 takes: a crate-local mock executor skeleton over

What remains intentionally out of scope: live
Add a crate-local mock executor over WorkflowSpec that records leaf, branch, and control-node results for Sequence, BranchSet, Leaf, Reduce, TeacherReview, LoopUntil, Cond, and Expand. Add reducer scaffolding for BranchTournament and ParetoFrontier, plus #2669 acceptance-style tests, without exposing workflow_run, spawning agents, or applying worktrees. Refs #2669. Harvests narrow WhaleFlow executor intent from #2482/#2486. Co-authored-by: AdityaVG13 <44177453+AdityaVG13@users.noreply.github.com>
v0.9 stewardship update: I opened #2829 as another narrow WhaleFlow maintainer slice inspired by the broader workflow direction here. #2829 adds crate-only deterministic replay from recorded leaf/control records, including stable leaf input hashes and

Thanks @AdityaVG13 for the WhaleFlow draft direction; the credit is preserved in the #2829 PR body and changelog.
v0.9 stewardship update: I opened #2830 as a crate-only model-policy slice for WhaleFlow. #2830 adds role/capability model selection, mock provider plumbing, and fail-closed JSON repair parsing without live provider calls or runtime provider switching. It keeps the broader #2672 provider-routing/adapter work out of scope while preserving the broader WhaleFlow draft direction here.

Thanks @AdityaVG13; the changelog and PR body keep the broader WhaleFlow draft credit trail intact.
Another narrow WhaleFlow foundation slice is up in #2831. It keeps the work inside

This still does not claim the full runtime workflow mode, live provider replay, or shared persistence pieces from this broader proposal. It is intended as CI-backed scaffolding we can build on safely for v0.9.0.

Thanks @AdityaVG13 for the draft architecture and cost-tracking direction that shaped these WhaleFlow slices.
Follow-up v0.9 stewardship update: #2833 adds another crate-only WhaleFlow foundation slice. New in #2833:

This still does not expose
New crate: crates/whaleflow providing declarative JSON-config-driven sub-agent swarm orchestration for CodeWhale. Inspired by Claude Code's Dynamic Workflows
Summary
Testing
- cargo fmt --all -- --check
- cargo clippy --workspace --all-targets --all-features
- cargo test --workspace --all-features

Checklist
Greptile Summary
This PR adds crates/whaleflow, a new declarative multi-agent orchestration layer that lets the model drive sub-agent swarms through a JSON workflow config. The TUI crate is extended with a workflow_run tool backed by WhaleFlowSpawner, which translates the scheduler's phase/task graph into SubAgentManager calls with optional git-worktree isolation.

- WorkflowConfig schema.
- IsolationMode::Worktree correctly returns Some(path) from cwd_path(), and git operations in WhaleFlowSpawner are correctly offloaded to tokio::task::spawn_blocking.
- timeout_secs/max_steps: both fields are now fully wired (timeout_secs wraps the poll loop in tokio::time::timeout and max_steps flows through SubAgentSpawnOptions), but neither appears in WORKFLOW_RUN_SCHEMA, so the model cannot discover or use them.
- Dropping a JoinHandle in Tokio detaches rather than cancels the task; when FailurePolicy::Abort fires mid-phase the remaining handles are dropped and those sub-agents continue running in the background, potentially writing to the shared workspace after the workflow reports Aborted.

Confidence Score: 3/5
Safe to merge only after addressing the parallel-abort task-detachment issue; sub-agents for Shared-isolation ReadWrite tasks can continue writing to the workspace after the orchestrator considers the workflow aborted.
The worktree lifecycle, blocking-IO isolation, and timeout wiring are well-implemented. However, the parallel fan-out code drops JoinHandles on abort without cancelling the underlying tokio tasks — those tasks are detached and keep running, potentially making file changes in the main workspace that the scheduler has already declared complete or skipped. This is a real behavioral defect on a core execution path.
crates/whaleflow/src/scheduler.rs (parallel abort detachment) and crates/whaleflow/src/tool.rs (schema missing max_steps/timeout_secs)
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant M as Model
    participant WT as WorkflowRunTool
    participant EW as execute_workflow
    participant S as Scheduler
    participant WFS as WhaleFlowSpawner
    participant WM as WorktreeManager
    participant SAM as SubAgentManager
    M->>WT: "workflow_run({config: ...})"
    WT->>EW: execute_workflow(config_json, spawner)
    EW->>S: Scheduler::new(config, spawner)
    EW->>S: run()
    loop for each phase (topo order)
        S->>S: build_prompt(task) - inject upstream results
        alt parallel phase
            par for each task
                S->>WFS: spawn(task_id, prompt, cwd, timeout, max_steps)
                alt "isolation = worktree"
                    WFS->>WM: create(task_id, workspace) via spawn_blocking
                    WM-->>WFS: worktree_path
                end
                WFS->>SAM: spawn_background_with_assignment_options
                loop poll 250ms
                    WFS->>SAM: get_result(agent_id)
                    SAM-->>WFS: status
                end
                alt completed and worktree
                    WFS->>WM: extract_changes (git diff HEAD)
                    WFS->>WM: apply_patch (git apply)
                    WFS->>WM: remove (git worktree remove)
                end
                WFS-->>S: AgentResult
            end
        else sequential phase
            S->>WFS: spawn(...)
            WFS-->>S: AgentResult
        end
    end
    S-->>EW: WorkflowResult
    EW-->>WT: result JSON
    WT-->>M: ToolResult(success, json)

Comments Outside Diff (3)
crates/whaleflow/src/worktree.rs, line 2029-2036
std::process::Command called from inside async context
WorktreeManager::create, extract_changes, apply_patch, and remove all use std::process::Command::output()/wait_with_output(), which are blocking syscalls. These methods are called directly inside the async fn spawn(...) implementation of WhaleFlowSpawner, which itself runs on a tokio worker thread (via tokio::spawn in the scheduler's parallel fan-out path). Blocking a tokio worker thread with long-running git operations (especially git worktree add or git apply on a large repository) can starve the async runtime and degrade all concurrent tasks.
The fix is to wrap each Command call in tokio::task::spawn_blocking(|| ...) and .await the result, or switch to tokio::process::Command.

crates/whaleflow/src/config.rs, line 761-784
scopes_overlap has false positives due to string prefix vs. path prefix mismatch
strip_glob("src/auth/**") yields "src/auth" and strip_glob("src/auth_admin/**") yields "src/auth_admin". The check "src/auth_admin".starts_with("src/auth") returns true (string prefix), so these two entirely disjoint directory scopes are incorrectly flagged as overlapping. std::path::Path::starts_with enforces component boundaries and would correctly return false here (Path::new("src/auth_admin").starts_with(Path::new("src/auth")) → false). The impact is a spurious OverlappingScopes warning logged at WARN level; the workflow continues, but users see misleading diagnostics.

crates/whaleflow/src/config.rs, line 635-648
depends_on_results can reference tasks in the same parallel phase, silently receiving no data
Validation confirms that depends_on_results IDs exist somewhere in the workflow, but it does not verify that they belong to a prior phase. If task B in a parallel phase lists task A (in the same phase) in depends_on_results, both tasks are spawned concurrently. When the scheduler calls build_prompt for B before A has completed, self.results.get("A") returns None, and the prompt silently includes "### A (not available)\n\n" instead of real context. The model sees no error; it just gets empty upstream data, producing subtly wrong behavior with no diagnostic.

Reviews (4): Last reviewed commit: "fix(whaleflow): improve scopes_overlap w..."
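The scopes_overlap false positive described above can be reproduced with the standard library alone. In this sketch `strip_glob` is a hypothetical stand-in for the PR's helper (its real body is not shown on this page) and both overlap functions are illustrative; only `str::starts_with` and `Path::starts_with` are real APIs.

```rust
use std::path::Path;

/// Hypothetical stand-in for the PR's `strip_glob`: drops a trailing
/// `/**` or `/*` from a scope pattern.
fn strip_glob(scope: &str) -> &str {
    scope.trim_end_matches("/**").trim_end_matches("/*")
}

/// String-prefix comparison: treats `src/auth` as a prefix of
/// `src/auth_admin`, which is the false positive from the review.
fn naive_overlap(a: &str, b: &str) -> bool {
    let (a, b) = (strip_glob(a), strip_glob(b));
    a.starts_with(b) || b.starts_with(a)
}

/// Component-wise comparison: `Path::starts_with` only matches whole
/// path components, so sibling directories no longer collide.
fn scopes_overlap(a: &str, b: &str) -> bool {
    let (a, b) = (Path::new(strip_glob(a)), Path::new(strip_glob(b)));
    a.starts_with(b) || b.starts_with(a)
}
```

Nested scopes such as `src/**` and `src/auth/**` still register as overlapping under the component-wise check, which is the behaviour conflict detection needs.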