From 602302e8808e58587d1cb59719a9b0c687792d16 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 15:19:59 -0600 Subject: [PATCH 01/22] docs(landlord): design spec for dynamic-workflow runtime Port Claude Code's ultracode / dynamic-workflow capability into @flint/landlord: a script-driven workflow runtime injecting the same hooks (agent/parallel/pipeline/phase/log/args/budget/workflow) on top of Flint primitives. Runtime becomes the package core; orchestrate() is rebuilt as a built-in auto-decompose workflow on top of it. Co-Authored-By: Claude Opus 4.8 (1M context) --- ...05-31-landlord-dynamic-workflows-design.md | 369 ++++++++++++++++++ 1 file changed, 369 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-31-landlord-dynamic-workflows-design.md diff --git a/docs/superpowers/specs/2026-05-31-landlord-dynamic-workflows-design.md b/docs/superpowers/specs/2026-05-31-landlord-dynamic-workflows-design.md new file mode 100644 index 0000000..ad40c66 --- /dev/null +++ b/docs/superpowers/specs/2026-05-31-landlord-dynamic-workflows-design.md @@ -0,0 +1,369 @@ +# Landlord → Dynamic Workflows — Design Spec + +**Date:** 2026-05-31 +**Status:** Approved +**Package:** `@flint/landlord` +**Goal:** Port Claude Code's "ultracode" / dynamic-workflow capability into Flint's `landlord` package — as close to identical as possible — turning Landlord from a declarative auto-decomposition orchestrator into a script-driven workflow runtime that injects the same hooks (`agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow`) with the same semantics, on top of Flint primitives. + +--- + +## 1. Background & gap analysis + +### What Landlord is today + +A **declarative auto-decomposition** orchestrator: + +- `decompose(prompt)` asks the LLM (via a forced `emit_plan` tool) to emit a `Contract[]` — a DAG of worker specs (`role`, `objective`, `subPrompt`, `checkpoints`, `outputSchema`, `dependsOn`, `maxRetries`). 
+- `resolveOrder()` topologically sorts contracts (DFS, `DependencyCycleError` on cycles). +- `orchestrate()` runs all contracts via `Promise.all` with per-role dependency gates, checkpoint validation (`validateCheckpoint`: ajv JSON-Schema tier + LLM-judge tier), retry-on-eviction up to `maxRetries`, escalation on exhaustion, and artifact handoff (`dep.field` injected into dependents). +- The plan is decided **once** by the model at decompose time, then statically executed. + +### What ultracode / dynamic workflows is (the target) + +A **script-based imperative** orchestrator. The model writes a plain-JS script with real control flow (loops, conditionals, fan-out) that deterministically drives subagents through injected hooks. Defining traits: + +- `export const meta = {...}` (pure literal) + a body using `agent()`, `parallel()`, `pipeline()`, `phase()`, `log()`, `args`, `budget`, `workflow()`. +- `agent(prompt, opts?)` — spawn a subagent; with `schema` it is forced to call a structured-output tool and the validated object is returned; without schema the final text is returned; `null` if skipped. +- `pipeline(items, ...stages)` — each item flows through all stages independently, **no barrier**. +- `parallel(thunks)` — concurrent with a **barrier**; a throwing thunk resolves to `null`. +- Concurrency cap `min(16, cpus-2)`; lifetime agent cap `1000`. +- Structured output validated at the tool-call layer (model retries on mismatch). +- Resume via journaling (`resumeFromRunId` replays the longest unchanged `agent()` prefix). +- Determinism sandbox: `Date.now`/`Math.random`/`new Date` throw inside scripts. +- `opts.model`, `opts.agentType`, `opts.isolation:'worktree'`. +- `workflow(nameOrRef, args)` runs another workflow inline (one level). + +Orchestration is **code, not a declared DAG**. + +### The port + +Build a **workflow runtime** in `@flint/landlord` that injects those exact hooks and executes a workflow, built on Flint's `agent()` / `tool()` / `budget`. 
The runtime becomes the package core; `orchestrate()` is rebuilt as a built-in workflow on top of it. + +--- + +## 2. Locked decisions + +| # | Decision | Choice | +|---|----------|--------| +| 1 | Authoring model | **Both** — typed `defineWorkflow({meta, run})` for devs **and** `runWorkflowScript(source)` for model-authored JS strings, sharing one runtime core. | +| 2 | Relationship to existing API | **Layer** — runtime is the core; `orchestrate()`/`decompose()`/`runTenant()`/`Contract` preserved and (for `orchestrate`) reimplemented on top. | +| 3 | Fidelity scope | **Maximum** — core hooks + schema + caps + events **plus** resume/journaling, determinism sandbox, agentType registry, per-agent model override, isolation (sandboxed workDir default + optional git-worktree backend). | +| 4 | Model-facing parity | **Yes** — ship `workflowTool()` (a Flint `tool()` exposing the runtime) + `WORKFLOW_TOOL_GUIDE` system prompt. | + +--- + +## 3. Module layout + +New `workflow/` subtree under `packages/landlord/src/`; `orchestrate.ts` rebuilt on it. Existing `decompose.ts`, `contract.ts`, `tenant.ts`, `validate.ts`, `tools/*` are reused. 
+ +``` +packages/landlord/src/ + workflow/ + types.ts # shared types: WorkflowContext, AgentOpts, WorkflowEvent, RuntimeConfig, WorkflowModule, Meta, stores + concurrency.ts # Semaphore(limit=min(16,cpus-2), floor 1) + global agent-cap guard (1000) + budget.ts # WorkflowBudget {total, spent(), remaining()} bridged onto flint Budget + events.ts # event emitter + WorkflowEvent union; maps to onEvent callback + journal.ts # JournalStore iface; memoryJournalStore(); fileJournalStore(dir) → agent-.jsonl; keying + replay + registry.ts # createAgentRegistry() (+ built-ins default/Explore/code-reviewer); createWorkflowRegistry() (named scripts) + isolation.ts # IsolationBackend iface; workdirIsolation (default); gitWorktreeIsolation (optional) + schema.ts # jsonSchema → forced structured_output tool; ajv validate; retry-on-mismatch; returns validated value + agentcall.ts # the agent() hook: flint agent() + schema + agentType + isolation + model + journaling + events + hooks.ts # buildContext(run): assembles {agent,parallel,pipeline,phase,log,args,budget,workflow} + runtime.ts # runWorkflow(module, config): owns run state (counters, journal, budget, signal, phase), invokes run(ctx) + meta.ts # restricted literal-evaluator + Meta validation (name/description/phases/whenToUse/model) + sandbox.ts # determinism sandbox: throwing stubs for Date/Math.random/new Date/process/require/globalThis/fs + script.ts # runWorkflowScript(source, config): parse meta, strip exports, wrap in AsyncFunction, inject hooks+sandbox + define.ts # defineWorkflow({meta, run}) → WorkflowModule (typed authoring path) + tool.ts # workflowTool(config) → flint Tool; WORKFLOW_TOOL_GUIDE; orchestratorAgent() convenience + index.ts # re-exports of the workflow surface + orchestrate.ts # rebuilt: built-in auto-decompose workflow on the runtime; public signature preserved + decompose.ts contract.ts tenant.ts validate.ts # reused (decompose + checkpoints power the built-in workflow) + tools/… # unchanged; 
standardTools(workDir) is the default agentType toolset + index.ts # exports runtime headline + preserved orchestrate/decompose/runTenant +``` + +`package.json` `exports` gains `"./workflow"` (in addition to `.` and `./tools`); the headline runtime symbols are **also** re-exported from `.` so the package's main entry reads as the workflow runtime. + +--- + +## 4. The hook API + +Identical names and semantics to the Workflow tool. The same object is injected as globals in a string script and passed as `wf` to a typed workflow. + +```ts +type WorkflowContext = { + // no schema → resolves the agent's final assistant text (string) + // with schema → resolves the validated structured object + // null only when a wrapping combinator catches an error (see parallel/pipeline) + agent(prompt: string, opts?: AgentOpts): Promise<unknown>; + + // BARRIER: awaits all thunks; a thunk that throws (or whose agent errors) resolves to null, never rejects + parallel<T>(thunks: Array<() => Promise<T>>): Promise<(T | null)[]>; + + // NO barrier between stages: each item flows through all stages independently. + // stage signature: (prevResult, originalItem, index). A stage throw drops that item to null and skips its remaining stages. + pipeline(items: unknown[], ...stages: StageFn[]): Promise<unknown[]>; + + phase(title: string): void; // starts/switches the current progress group + log(message: string): void; // narrator progress line (emitted as a WorkflowEvent) + + args: unknown; // the value passed as RuntimeConfig.args, verbatim + + budget: { + total: number | null; // token target, or null when unset + spent(): number; // output tokens used this run (main + nested) + remaining(): number; // max(0, total - spent()) or Infinity when total is null + }; + + // run another workflow inline; shares this run's concurrency cap, agent counter, budget, signal, journal. + // one level only — workflow() inside a child throws. 
+ workflow(ref: string | { scriptPath?: string; source?: string }, args?: unknown): Promise<unknown>; +}; + +type AgentOpts = { + label?: string; // display label override (defaults to a slug of the prompt / phase) + phase?: string; // explicit progress group for this call (avoids races in parallel/pipeline) + schema?: object; // JSON Schema → forced structured output; return value is validated + model?: string; // per-agent model override + isolation?: 'worktree'; // select the git-worktree isolation backend for this agent + agentType?: string; // resolve a preset from the AgentTypeRegistry; composes with schema +}; + +type StageFn = (prev: unknown, originalItem: unknown, index: number) => unknown | Promise<unknown>; +``` + +`parallel`/`pipeline` are **plain combinators over `agent()`** — they do not themselves call the model; they only schedule the thunks/stages the script provides through the shared semaphore. + +--- + +## 5. Execution semantics + +### 5.1 Concurrency & caps (`concurrency.ts`) + +- A `Semaphore` with `limit = max(1, min(16, os.cpus().length - 2))`. Every `agent()` acquires a slot before running and releases after; excess calls queue. `parallel`/`pipeline` fan work out but only `limit` agents run at once. +- A per-run lifetime counter increments on each `agent()` start; the 1001st throws `AgentCapError` (a `FlintError`-style class with `code: 'workflow.agent_cap'`). The cap is a runaway backstop. + +### 5.2 `agent()` (`agentcall.ts`) + +Order of operations for one call: + +1. **Resume check** — compute `key = { index, hash(prompt, opts) }` (index is the monotonic call counter). If resuming and the journal entry at `index` exists with a matching hash, return its cached result immediately (no model call, no slot). The first mismatch/new index runs live; everything after re-runs live. +2. **Acquire** a concurrency slot; increment + check the agent cap. +3. **Resolve preset** from `agentType` (default = `'default'`): `{ systemPrompt, tools?(workDir), model? }`. +4. 
**Resolve model**: `opts.model ?? preset.model ?? config.models.default` (the runtime's `RuntimeConfig.models` is `{ default: string; [tier: string]: string }`; `orchestrate()` maps its `tenantModel` → `models.default`, and uses `landlordModel` for the decompose phase). +5. **Isolation**: obtain a `workDir` from the chosen backend (`workdirIsolation` default; `gitWorktreeIsolation` when `opts.isolation === 'worktree'`). Build `tools = preset.tools?.(workDir) ?? standardTools(workDir)`. +6. **Schema** (if present): append a forced `structured_output` tool (`schema.ts`) whose `jsonSchema = opts.schema`; instruct the agent it must call it. ajv-validate the call; on failure return a tool error (`"… does not match schema: . Revise and call structured_output again."`) so the flint `agent()` loop retries; capture the validated value. +7. **Run** flint `agent({ adapter, model, messages:[{role:'system',content:systemPrompt+context},{role:'user',content:prompt}], tools, budget })`. +8. **Emit** `agent_started` before, `agent_complete`/`agent_error` after; **journal** the result (`{index, hash, result}`); release the slot; release/clean the isolation workDir. +9. **Return**: schema → validated object; otherwise final assistant text. A hard error throws out of `agent()` (and is converted to `null` only by a wrapping `parallel`/`pipeline`). + +### 5.3 `parallel` & `pipeline` (`hooks.ts`) + +- `parallel(thunks)` → `Promise.all(thunks.map(run-with-catch))`; each thunk's rejection/`agent()` error becomes `null`. Never rejects. +- `pipeline(items, ...stages)` → for each item, an independent async chain through the stages with **no cross-item barrier**; wall-clock = slowest single-item chain. Stage callback gets `(prev, originalItem, index)`. A throwing stage sets that item's result to `null` and skips its remaining stages. Returns an array aligned to `items`. + +### 5.4 Budget (`budget.ts`) + +Bridged onto flint's shared `Budget` (passed to every `agent()` so usage is cumulative). 
`WorkflowBudget.total` = the run's token target (`config.budget` token cap, or `null`); `spent()` reads the flint budget's token usage; `remaining()` = `max(0, total - spent())`, or `Infinity` when `total` is `null`. Hitting the cap makes flint `agent()` fail with `BudgetExhausted`, surfaced as a thrown error from `agent()` — enabling `while (budget.total && budget.remaining() > N) {…}` loops and `Math.floor(budget.total / 100_000)` fleet sizing. + +### 5.5 Events (`events.ts`) + +```ts +type WorkflowEvent = + | { type: 'phase_started'; title: string } + | { type: 'log'; message: string } + | { type: 'agent_started'; label: string; phase?: string; agentType: string; model: string } + | { type: 'agent_complete'; label: string; phase?: string; tokens: number } + | { type: 'agent_error'; label: string; phase?: string; error: string } + | { type: 'workflow_complete'; result: unknown }; +``` + +Delivered via `config.onEvent`. `config.signal?: AbortSignal` cancels the run (in-flight agents abort, queued agents are skipped) — the library analogue of background-run + abort. This mirrors the existing `LandlordEvent`/`onEvent` style. + +--- + +## 6. Signature fidelity features + +### 6.1 Resume / journaling (`journal.ts`) + +```ts +interface JournalStore { + append(runId: string, entry: JournalEntry): Promise<void>; + load(runId: string): Promise<JournalEntry[]>; +} +type JournalEntry = { index: number; hash: string; result: unknown }; +``` + +- `memoryJournalStore()` (default, non-persistent) and `fileJournalStore(dir)` (writes `agent-.jsonl`). +- `runWorkflowScript(source, { resumeFromRunId, runId, journal })`: on resume, replay the **longest unchanged prefix** — for each `index`, if the stored hash matches the about-to-run `hash(prompt, opts)`, return the cached `result`; the first divergence and everything after runs live. Same script + same args ⇒ 100% hit. 
+- Replay correctness depends on a deterministic call sequence — guaranteed in string mode by the sandbox (§6.2) and documented as a constraint for typed workflows. + +### 6.2 Determinism sandbox (`sandbox.ts`) + +String scripts execute in an `AsyncFunction` whose parameter list **shadows** nondeterministic / host globals with throwing stubs: `Date` (and `Date.now`), `Math` (a clone with a throwing `random`), `process`, `require`, `globalThis`, `fs`, `import`-like access. Standard pure built-ins (`JSON`, `Array`, `Object`, `Math` minus `random`, etc.) remain. This mirrors the product ("`Date.now()`/`Math.random()`/`new Date()` throw") and is the precondition for §6.1. Typed workflows are not sandboxed (cannot shadow lexical globals) but carry the same documented "no nondeterminism if you want resume" constraint. + +### 6.3 agentType registry (`registry.ts`) + +```ts +type AgentType = { systemPrompt: string; tools?: (workDir: string) => Tool[]; model?: string }; +function createAgentRegistry(types?: Record<string, AgentType>): AgentTypeRegistry; // merges over built-ins +``` + +Built-ins mirroring Claude Code: + +| Name | Tools | System prompt focus | +|------|-------|---------------------| +| `default` | `standardTools(workDir)` (file r/w, bash, web) | general worker; structured results via tools | +| `Explore` | read-only: `fileReadTool`, read-only `bashTool`, `webFetchTool` | broad search; reads excerpts, returns conclusions, no writes | +| `code-reviewer` | read tools | review for bugs/quality; returns findings | + +`opts.agentType` resolves a preset; when combined with `schema`, the preset's system prompt is used **and** the structured-output instruction is appended (composes). + +### 6.4 Per-agent model override + +`opts.model` wins, then preset `model`, then `config.models.default`. `config.models` carries named tiers (e.g. `{ default, fast }`) that presets/scripts reference by key. 
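As a rough illustration of the precedence in §6.4, the sketch below resolves a model for one call. `resolveModel`, `Preset`, and `CallOpts` are hypothetical names introduced here, and the tier-key lookup is an assumption about how "reference by key" might work rather than confirmed behavior.

```typescript
// Sketch only: these names are illustrative, not the package's confirmed API.
type Models = { default: string } & Record<string, string>;
type Preset = { model?: string };
type CallOpts = { model?: string };

function resolveModel(opts: CallOpts, preset: Preset, models: Models): string {
  // Documented precedence: per-call override, then preset, then the run default.
  const requested = opts.model ?? preset.model;
  if (requested === undefined) return models.default;
  // Assumption: a value matching a tier key (e.g. 'fast') is mapped through
  // `models`; anything else is treated as a literal model id.
  const tier = Object.prototype.hasOwnProperty.call(models, requested)
    ? models[requested]
    : undefined;
  return tier ?? requested;
}
```

Under these assumptions, a call with no override and a preset without `model` falls through to `models.default`, while `opts.model` always takes priority over whatever the preset declares.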
+ +### 6.5 Isolation (`isolation.ts`) + +```ts +interface IsolationBackend { acquire(runId: string, label: string): Promise<{ workDir: string; release(): Promise<void> }>; } +``` + +- `workdirIsolation(baseDir)` (default): a fresh sandboxed subdir per agent, reusing Landlord's path-guarded tools. `release()` is a no-op (kept for inspection). +- `gitWorktreeIsolation(repoDir, baseDir)`: `git worktree add` a throwaway worktree per agent; `release()` runs `git worktree remove` (auto-removed if unchanged). Selected by `opts.isolation === 'worktree'`; falls back to `workdirIsolation` with a `log()` warning when `repoDir` is not a git repo. + +--- + +## 7. String vs typed authoring + `meta` + +### 7.1 `meta` (`meta.ts`) + +```ts +type Meta = { + name: string; // required + description: string; // required + whenToUse?: string; + model?: string; + phases?: Array<{ title: string; detail?: string; model?: string }>; +}; +``` + +A **restricted literal-evaluator** parses `export const meta = { … }`: only object/array/string/number/boolean/null literals are allowed (no identifiers, calls, spreads, template interpolation), matching the product's "pure literal" rule. Invalid meta → a clear `MetaError` (`code: 'workflow.meta'`). + +### 7.2 String path (`script.ts`) + +`runWorkflowScript(source, config)`: + +1. Extract and parse `export const meta = {…}` via `meta.ts`. +2. Strip the `export const meta` statement and any other `export`/`import` lines (string scripts are not real modules). +3. Wrap the remaining body — which may use top-level `await` and a final `return` — in `new AsyncFunction(...hookNames, ...sandboxStubNames, body)`. +4. Invoke with the hook implementations (`agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow`) and the sandbox stubs. +5. Resolve to the body's `return` value, emit `workflow_complete`. 
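Steps 3 to 5 above could look roughly like the following. This is a minimal sketch under stated assumptions: `runScriptBody`, `makeSandboxStubs`, and `banned` are hypothetical names, the body is assumed to have had `meta` and any other `export`/`import` lines stripped already, and only `Date`, `Math`, `process`, and `require` are shadowed here (the spec also lists `globalThis`, `fs`, and `import`-like access).

```typescript
// Sketch only: helper names are illustrative, not the package's confirmed API.
type AsyncFn = (...values: unknown[]) => Promise<unknown>;
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor as new (
  ...paramsAndBody: string[]
) => AsyncFn;

function banned(name: string): () => never {
  return () => {
    throw new Error(`${name} is unavailable inside a workflow script`);
  };
}

function makeSandboxStubs(): Record<string, unknown> {
  // A plain function (not an arrow) so that both `Date()` and `new Date()` throw.
  const DateStub = Object.assign(
    function () {
      banned('Date')();
    },
    { now: banned('Date.now') },
  );
  // Inherits every pure Math helper; only `random` is overridden.
  const MathStub: Math = Object.create(Math, { random: { value: banned('Math.random') } });
  // Any property read on `process` throws.
  const processStub = new Proxy({}, { get: () => banned('process')() });
  return { Date: DateStub, Math: MathStub, process: processStub, require: banned('require') };
}

function runScriptBody(body: string, hooks: Record<string, unknown>): Promise<unknown> {
  // Stubs are spread last so a hook can never re-expose a shadowed global.
  const bindings: Record<string, unknown> = { ...hooks, ...makeSandboxStubs() };
  const names = Object.keys(bindings);
  // Parameter names shadow the real globals inside the compiled body.
  const fn = new AsyncFunction(...names, body);
  return fn(...names.map((n) => bindings[n]));
}
```

Because the stubs are ordinary parameters, the shadowing is lexical: a body such as `return Date.now();` rejects, while `return Math.floor(args.n);` still works through the inherited `Math` helpers. Parameter shadowing alone is not a security boundary, so a real implementation would likely need more than this.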
+ +### 7.3 Typed path (`define.ts`) + +```ts +function defineWorkflow(def: { meta: Meta; run: (wf: WorkflowContext) => Promise<unknown> }): WorkflowModule; +``` + +Type-checked, no eval; produces a `WorkflowModule` the runtime executes identically. `runWorkflow(module, config)` is the shared entry both paths funnel into. + +--- + +## 8. `workflowTool` + guide (`tool.ts`) + +```ts +function workflowTool(config: { + adapter: ProviderAdapter; + models: { default: string; [tier: string]: string }; + registry?: WorkflowRegistry; // named saved workflows for workflow(name)/{name} + agentTypes?: AgentTypeRegistry; + journal?: JournalStore; + isolation?: IsolationBackend; + onEvent?: (e: WorkflowEvent) => void; +}): Tool; +``` + +- Flint `tool()` named `workflow`, input `{ script: string; args?: unknown; name?: string; scriptPath?: string; resumeFromRunId?: string }`. +- Handler runs the runtime (`runWorkflowScript` for `script`, or registry lookup for `name`) and returns `{ runId, result }` (result summarized if large). +- `WORKFLOW_TOOL_GUIDE: string` — a system-prompt block adapted from the real Workflow tool description (pipeline-by-default, parallel-is-a-barrier, schema for structured output, adversarial-verify, judge-panel, loop-until-dry, multi-modal sweep, completeness-critic, no-silent-caps). Drop the tool + guide into any `agent()` and that agent authors-and-runs workflows exactly like Claude Code. +- `orchestratorAgent(config)` — convenience that returns a configured `agent()` wired with `workflowTool` + the guide as system prompt. + +--- + +## 9. `orchestrate()` rebuilt on the runtime (`orchestrate.ts`) + +`orchestrate()`, `decompose()`, `runTenant()`, `Contract`, `resolveOrder()`, `DependencyCycleError`, and all existing exported types **keep their signatures**, and the existing `orchestrate.test.ts` must pass unchanged. Internally `orchestrate()` becomes a built-in workflow: + +1. 
`phase('decompose')` → `decompose(prompt)` → `Contract[]`; `resolveOrder()` for the cycle check (still throws `DependencyCycleError`). +2. Schedule tenants with the existing per-role dependency-gate logic, but run each tenant through the **runtime's `agent()` path** (a `tenant` agentType that applies checkpoints + `maxRetries` via the existing `runTenant` internals) so tenants inherit the semaphore, agent cap, journaling, budget, and events. +3. Map runtime `WorkflowEvent`s back onto the existing `LandlordEvent` names (`tenant_started`, `checkpoint_passed`, `tenant_complete`, `tenant_evicted`, `tenant_escalated`, `job_complete`) so `onEvent` consumers are unaffected. +4. Return the same `OrchestrateResult` shape. + +Net effect: the auto-decompose feature becomes one built-in workflow; nothing in the public API is removed. + +--- + +## 10. Public exports & packaging + +- `src/index.ts` adds the runtime headline: `defineWorkflow`, `runWorkflowScript`, `runWorkflow`, `workflowTool`, `WORKFLOW_TOOL_GUIDE`, `orchestratorAgent`, `createAgentRegistry`, `createWorkflowRegistry`, `memoryJournalStore`, `fileJournalStore`, `workdirIsolation`, `gitWorktreeIsolation`, and all workflow types — alongside the preserved `orchestrate`/`decompose`/`runTenant`/`validateCheckpoint`/`ContractSchema`/`CheckpointSchema` exports. +- `package.json` `exports` adds `"./workflow": { types, import }`. `dependencies` unchanged (`ajv` reused for schema validation; `zod` for contracts). No new runtime deps in `flint` core. +- Biome/TS conventions: ESM, `strict`, `noUncheckedIndexedAccess`, `exactOptionalPropertyTypes`, single quotes, semicolons, trailing commas, `useImportType`, **no default exports** (use named factory functions), `Result` over throws at public boundaries where the existing package already does so (`runWorkflowScript`/`runWorkflow` return `Promise>`; hook-internal errors throw within the script as the product does). + +--- + +## 11. 
Testing plan + +vitest in `packages/landlord/test/workflow/`, using flint `mockAdapter`/`scriptedAdapter` (no network): + +| File | Covers | +|------|--------| +| `concurrency.test.ts` | semaphore never exceeds `min(16,cpus-2)`; agent cap throws on the 1001st | +| `runtime.test.ts` | `parallel` barrier + throw→null; `pipeline` no-barrier ordering + stage-throw→null; `args`/`phase`/`log` events | +| `schema.test.ts` | forced structured-output tool; ajv validate; retry-on-mismatch returns the corrected value | +| `budget.test.ts` | `total`/`spent()`/`remaining()` math; exhaustion throws from `agent()` | +| `journal.test.ts` | unchanged-prefix replay (cache hit, no model call); first-divergence reruns live; file store JSONL round-trip | +| `sandbox.test.ts` | `Date.now`/`new Date`/`Math.random`/`process` throw inside a script; pure built-ins work; top-level await + return | +| `registry.test.ts` | built-in presets resolve; `agentType` composes with `schema`; custom registry merges over built-ins | +| `isolation.test.ts` | `workdirIsolation` creates distinct sandboxed dirs; worktree backend falls back cleanly outside a git repo | +| `meta.test.ts` | literal-evaluator accepts pure literals, rejects calls/identifiers/spreads | +| `tool.test.ts` | `workflowTool` runs a scripted-adapter-authored workflow end-to-end; returns `{runId,result}` | +| `define.test.ts` | typed `defineWorkflow` runs identically to the equivalent string script | +| `orchestrate.test.ts` (existing) | **must pass unchanged** (backward compat) | + +A Changeset (`pnpm changeset`) describes the new `@flint/landlord` workflow surface. + +--- + +## 12. Docs plan + +- New `docs/landlord/workflow.md` (overview + mental model), `docs/landlord/hooks.md` (full hook reference), `docs/landlord/resume.md`, `docs/landlord/agent-types.md`, `docs/landlord/isolation.md`, `docs/landlord/workflow-tool.md`. 
+- Update `docs/landlord/index.md` (add the workflow-runtime mental model alongside the tenant model) and `docs/landlord/orchestrate.md` (note it is now runtime-backed; behavior unchanged). +- New example `docs/examples/dynamic-workflow.md` (a review→verify pipeline script, both string and typed). +- VitePress `docs/.vitepress/config.ts` sidebar/nav additions; README `## Packages` / Landlord bullet mention of the workflow runtime. +- All code samples real and runnable against the actual API (project doc norm). + +--- + +## 13. Constraints & non-goals + +- **Backward compatibility:** `orchestrate`/`decompose`/`runTenant`/`Contract` public API and existing tests preserved. +- **No network in tests:** all behavior tested via mock/scripted adapters. +- **Out of scope (harness-coupled, intentionally not ported):** the `/workflows` TUI progress tree (replaced by `onEvent`), background task scheduling + `` (replaced by the async `runWorkflow` + `AbortSignal`), MCP `ToolSearch` deferred-tool loading (callers pass tools/agentTypes explicitly). These are noted in docs as the library's equivalents. +- **Determinism in typed mode** cannot be enforced lexically; documented as a resume precondition rather than sandboxed. +- **Single-level `workflow()` nesting**, matching the product (nested `workflow()` throws). + +--- + +## 14. File-by-file work breakdown (for the implementation plan) + +Independent-ish units, buildable in parallel where noted: + +1. `workflow/types.ts`, `workflow/events.ts`, `workflow/concurrency.ts`, `workflow/budget.ts` — foundation types + primitives (parallel-safe). +2. `workflow/journal.ts`, `workflow/registry.ts`, `workflow/isolation.ts` — stores/registries/backends (parallel-safe, depend on 1). +3. `workflow/schema.ts`, `workflow/agentcall.ts` — the agent() hook + structured output (depends on 1–2). +4. `workflow/hooks.ts`, `workflow/runtime.ts` — context assembly + run engine (depends on 1–3). +5. 
`workflow/meta.ts`, `workflow/sandbox.ts`, `workflow/script.ts`, `workflow/define.ts` — authoring paths (depends on 4). +6. `workflow/tool.ts` (+ `WORKFLOW_TOOL_GUIDE`, `orchestratorAgent`) — model-facing parity (depends on 5). +7. `orchestrate.ts` rebuild + `index.ts`/`package.json` wiring (depends on 4–5). +8. Tests (§11), docs (§12), Changeset. From 12f3ce1cd5d5daa3352ecc637c9a0a293ec34769 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 15:38:47 -0600 Subject: [PATCH 02/22] docs(landlord): implementation plan for dynamic-workflow runtime 16 TDD tasks building the workflow runtime in @flint/landlord: errors/types, concurrency caps, budget bridge, events, journal/resume, registries, isolation, structured-output schema, the agent() hook, the workflow context, meta parser + determinism sandbox, script/typed authoring, the run engine, workflowTool, the backward-compatible orchestrate() rebuild, and docs/changeset. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../2026-05-31-landlord-dynamic-workflows.md | 2671 +++++++++++++++++ 1 file changed, 2671 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-31-landlord-dynamic-workflows.md diff --git a/docs/superpowers/plans/2026-05-31-landlord-dynamic-workflows.md b/docs/superpowers/plans/2026-05-31-landlord-dynamic-workflows.md new file mode 100644 index 0000000..3c7f063 --- /dev/null +++ b/docs/superpowers/plans/2026-05-31-landlord-dynamic-workflows.md @@ -0,0 +1,2671 @@ +# Landlord Dynamic Workflows Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+ +**Goal:** Port Claude Code's "ultracode" / dynamic-workflow runtime into `@flint/landlord` — a script-driven workflow engine that injects the same hooks (`agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow`) on top of Flint primitives, with structured-output schemas, concurrency/agent caps, resume/journaling, a determinism sandbox, an agentType registry, isolation backends, and a model-facing `workflowTool`. + +**Architecture:** A new `packages/landlord/src/workflow/` subtree becomes the package core. The run engine (`runtime.ts`) owns per-run state (semaphore, agent counter, budgets, journal, event emitter, runId) and executes a `WorkflowModule` (`{ meta, run }`). Two authoring paths produce a module: `defineWorkflow()` (typed) and `runWorkflowScript()` (a JS string compiled in a determinism sandbox). The same hooks reach the script as injected globals and the typed function as a `wf` context. `orchestrate()` is rebuilt as a built-in auto-decompose workflow on the runtime while preserving its public API and existing tests. + +**Tech Stack:** TypeScript 5.7 (ESM, `strict`, `noUncheckedIndexedAccess`, `exactOptionalPropertyTypes`, `verbatimModuleSyntax`), Flint core (`flint`, `flint/budget`, `flint/errors`, `flint/testing`), `ajv` (schema validation), `zod` (tool input), `vitest`, Biome (single quotes, semicolons, trailing commas, `useImportType`, no default exports), `tsup` build. + +**Conventions every task follows:** +- All relative imports use the `.ts` extension (e.g. `from './errors.ts'`) — required by `allowImportingTsExtensions`. +- Use `import type` for type-only imports (Biome `useImportType` is an error). +- No default exports (Biome error) — named factory functions/classes only. +- For optional object properties assigned conditionally, use the spread guard pattern `...(x !== undefined ? { x } : {})` (required by `exactOptionalPropertyTypes`). +- Working directory for all commands: `packages/landlord/`. 
Run a single test file with `pnpm vitest run test/workflow/.test.ts`. +- Commit after each task with the message shown. + +--- + +## File Structure + +| File | Responsibility | +|------|----------------| +| `src/workflow/errors.ts` | `WorkflowError`, `AgentCapError`, `MetaError` (extend `FlintError`) | +| `src/workflow/types.ts` | Pure shared types: `AgentOpts`, `StageFn`, `WorkflowContext`, `WorkflowEvent`, `Meta`, `WorkflowModule`, `Models`, `WorkflowBudgetView`, `WorkflowRunResult` | +| `src/workflow/concurrency.ts` | `defaultConcurrency()`, `Semaphore`, `AgentCounter` | +| `src/workflow/budget.ts` | `WorkflowBudget` (token-target tracker) + `budgetView()` | +| `src/workflow/events.ts` | `EventEmitter`, `EventSink` | +| `src/workflow/journal.ts` | `JournalEntry`, `JournalStore`, `memoryJournalStore()`, `fileJournalStore()`, `hashCall()` | +| `src/workflow/registry.ts` | `AgentType`, `AgentTypeRegistry`, `createAgentRegistry()`, `WorkflowRegistry`, `createWorkflowRegistry()` | +| `src/workflow/isolation.ts` | `IsolationBackend`, `IsolationLease`, `workdirIsolation()`, `gitWorktreeIsolation()` | +| `src/workflow/schema.ts` | `makeStructuredOutput()` — forced structured-output tool + ajv validation | +| `src/workflow/agentcall.ts` | `RunDeps`, `runAgentCall()` — the `agent()` hook | +| `src/workflow/hooks.ts` | `buildContext()` — assembles the `WorkflowContext` (parallel/pipeline/phase/log/budget/workflow) | +| `src/workflow/runtime.ts` | `RuntimeConfig`, `runWorkflow()`, `runWorkflowScript()` — the run engine | +| `src/workflow/meta.ts` | `parseMeta()`, `parseLiteral()` — restricted JS-literal parser | +| `src/workflow/sandbox.ts` | `sandboxBindings()` — throwing stubs for Date/Math.random/process/etc. 
| +| `src/workflow/script.ts` | `compileScript()`, `stripModuleSyntax()` | +| `src/workflow/define.ts` | `defineWorkflow()` | +| `src/workflow/tool.ts` | `workflowTool()`, `WORKFLOW_TOOL_GUIDE`, `orchestratorAgent()` | +| `src/workflow/index.ts` | Re-exports of the workflow surface | +| `src/orchestrate.ts` | Rebuilt: auto-decompose workflow on the runtime (public API preserved) | +| `src/index.ts` | Adds runtime headline exports | +| `package.json`, `tsup.config.ts` | Add `./workflow` export + build entry | + +--- + +## Task 1: Workflow errors + shared types + +**Files:** +- Create: `packages/landlord/src/workflow/errors.ts` +- Create: `packages/landlord/src/workflow/types.ts` +- Test: `packages/landlord/test/workflow/types.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/types.test.ts +import { describe, expect, it } from 'vitest'; +import { AgentCapError, MetaError, WorkflowError } from '../../src/workflow/errors.ts'; + +describe('workflow errors', () => { + it('WorkflowError carries a code and name', () => { + const e = new WorkflowError('boom', 'workflow.test'); + expect(e.code).toBe('workflow.test'); + expect(e.name).toBe('WorkflowError'); + expect(e).toBeInstanceOf(Error); + }); + + it('AgentCapError and MetaError have fixed codes', () => { + expect(new AgentCapError('x').code).toBe('workflow.agent_cap'); + expect(new MetaError('x').code).toBe('workflow.meta'); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/types.test.ts` +Expected: FAIL — cannot find module `../../src/workflow/errors.ts`. + +- [ ] **Step 3: Write `errors.ts`** + +```ts +// src/workflow/errors.ts +import { FlintError } from 'flint/errors'; + +export class WorkflowError extends FlintError { + constructor(message: string, code: string, cause?: unknown) { + super(message, { code, ...(cause !== undefined ? 
{ cause } : {}) }); + this.name = 'WorkflowError'; + } +} + +export class AgentCapError extends WorkflowError { + constructor(message: string) { + super(message, 'workflow.agent_cap'); + this.name = 'AgentCapError'; + } +} + +export class MetaError extends WorkflowError { + constructor(message: string) { + super(message, 'workflow.meta'); + this.name = 'MetaError'; + } +} +``` + +- [ ] **Step 4: Write `types.ts`** + +```ts +// src/workflow/types.ts +import type { ProviderAdapter, Tool } from 'flint'; + +export type Models = { default: string } & Record<string, string>; + +export type AgentOpts = { + label?: string; + phase?: string; + schema?: Record<string, unknown>; + model?: string; + isolation?: 'worktree'; + agentType?: string; +}; + +export type StageFn = ( + prev: unknown, + originalItem: unknown, + index: number, +) => unknown | Promise<unknown>; + +export type WorkflowBudgetView = { + total: number | null; + spent: () => number; + remaining: () => number; +}; + +export type WorkflowContext = { + agent: (prompt: string, opts?: AgentOpts) => Promise<unknown>; + parallel: (thunks: Array<() => Promise<unknown>>) => Promise<Array<unknown>>; + pipeline: (items: unknown[], ...stages: StageFn[]) => Promise<unknown[]>; + phase: (title: string) => void; + log: (message: string) => void; + args: unknown; + budget: WorkflowBudgetView; + workflow: ( + ref: string | { scriptPath?: string; source?: string }, + args?: unknown, + ) => Promise<unknown>; +}; + +export type WorkflowEvent = + | { type: 'phase_started'; title: string } + | { type: 'log'; message: string } + | { type: 'agent_started'; label: string; phase?: string; agentType: string; model: string } + | { type: 'agent_complete'; label: string; phase?: string; tokens: number } + | { type: 'agent_error'; label: string; phase?: string; error: string } + | { type: 'workflow_complete'; result: unknown }; + +export type MetaPhase = { title: string; detail?: string; model?: string }; + +export type Meta = { + name: string; + description: string; + whenToUse?: string; + model?: string; + phases?: MetaPhase[]; +}; 
+ +export type WorkflowModule = { + meta: Meta; + run: (wf: WorkflowContext) => Promise; +}; + +export type WorkflowRunResult = { + runId: string; + result: unknown; + events: WorkflowEvent[]; +}; + +// Re-exported here so consumers can build tool registries without importing flint directly. +export type { ProviderAdapter, Tool }; +``` + +- [ ] **Step 5: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/types.test.ts && pnpm typecheck` +Expected: PASS; typecheck clean. + +- [ ] **Step 6: Commit** + +```bash +git add packages/landlord/src/workflow/errors.ts packages/landlord/src/workflow/types.ts packages/landlord/test/workflow/types.test.ts +git commit -m "feat(landlord): workflow errors and shared types" +``` + +--- + +## Task 2: Concurrency — Semaphore + agent cap + +**Files:** +- Create: `packages/landlord/src/workflow/concurrency.ts` +- Test: `packages/landlord/test/workflow/concurrency.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/concurrency.test.ts +import { describe, expect, it } from 'vitest'; +import { AgentCapError } from '../../src/workflow/errors.ts'; +import { AgentCounter, Semaphore, defaultConcurrency } from '../../src/workflow/concurrency.ts'; + +describe('Semaphore', () => { + it('never runs more than `limit` tasks concurrently', async () => { + const sem = new Semaphore(2); + let active = 0; + let peak = 0; + const task = async () => { + const release = await sem.acquire(); + active++; + peak = Math.max(peak, active); + await new Promise((r) => setTimeout(r, 5)); + active--; + release(); + }; + await Promise.all(Array.from({ length: 8 }, () => task())); + expect(peak).toBeLessThanOrEqual(2); + }); +}); + +describe('AgentCounter', () => { + it('throws AgentCapError past the cap', () => { + const c = new AgentCounter(3); + c.increment(); + c.increment(); + c.increment(); + expect(() => c.increment()).toThrow(AgentCapError); + }); +}); + +describe('defaultConcurrency', () => { + it('is at least 1 
and at most 16', () => { + const n = defaultConcurrency(); + expect(n).toBeGreaterThanOrEqual(1); + expect(n).toBeLessThanOrEqual(16); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/concurrency.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `concurrency.ts`** + +```ts +// src/workflow/concurrency.ts +import { cpus } from 'node:os'; +import { AgentCapError } from './errors.ts'; + +export function defaultConcurrency(): number { + return Math.max(1, Math.min(16, cpus().length - 2)); +} + +/** + * Counting semaphore. The fast path (slot available) runs synchronously up to + * `active++`, so concurrent synchronous `acquire()` calls cannot oversubscribe. + * A released slot is handed directly to the oldest waiter (`active` is left + * unchanged), so a late `acquire()` cannot take the slot before the waiter resumes. + */ +export class Semaphore { + private active = 0; + private readonly waiters: Array<() => void> = []; + + constructor(private readonly limit: number) {} + + async acquire(): Promise<() => void> { + if (this.active < this.limit) { + this.active++; + } else { + await new Promise<void>((resolve) => this.waiters.push(resolve)); + } + let released = false; + return () => { + if (released) return; + released = true; + const next = this.waiters.shift(); + if (next) next(); + else this.active--; + }; + } +} + +export class AgentCounter { + private count = 0; + + constructor(private readonly cap: number = 1000) {} + + increment(): void { + this.count += 1; + if (this.count > this.cap) { + throw new AgentCapError(`Workflow exceeded the ${this.cap}-agent cap`); + } + } + + get value(): number { + return this.count; + } +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/concurrency.test.ts` +Expected: PASS.
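
As a quick sanity check on the clamp used by `defaultConcurrency()`, the sketch below factors the arithmetic into a pure function so it can be evaluated for specific core counts. `concurrencyFor` is a hypothetical helper introduced only for illustration; it is not part of the planned `concurrency.ts`.

```ts
// Hypothetical helper (not part of concurrency.ts): the same clamp as
// defaultConcurrency(), but with the core count passed in as a parameter.
export function concurrencyFor(cpuCount: number): number {
  // Leave two cores free, never exceed 16, and never drop below 1.
  return Math.max(1, Math.min(16, cpuCount - 2));
}

// Evaluate the clamp for a few machine sizes, from one core up to 64.
const samples = [1, 2, 4, 8, 64].map((n) => ({ cores: n, limit: concurrencyFor(n) }));
console.log(samples);
```

Without the outer `Math.max(1, ...)`, a one- or two-core machine would get a limit of -1 or 0, and a semaphore built with that limit would never grant a slot.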
+ +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/concurrency.ts packages/landlord/test/workflow/concurrency.test.ts +git commit -m "feat(landlord): concurrency semaphore and agent cap" +``` + +--- + +## Task 3: Workflow budget + +**Files:** +- Create: `packages/landlord/src/workflow/budget.ts` +- Test: `packages/landlord/test/workflow/budget.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/budget.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget, budgetView } from '../../src/workflow/budget.ts'; + +describe('WorkflowBudget', () => { + it('tracks spent output tokens and computes remaining against a target', () => { + const wb = new WorkflowBudget(100); + wb.record({ input: 10, output: 30 }); + wb.record({ input: 5, output: 20 }); + expect(wb.spent()).toBe(50); + expect(wb.remaining()).toBe(50); + }); + + it('remaining is Infinity when total is null', () => { + const wb = new WorkflowBudget(null); + wb.record({ output: 1000 }); + expect(wb.spent()).toBe(1000); + expect(wb.remaining()).toBe(Number.POSITIVE_INFINITY); + }); + + it('budgetView exposes total/spent/remaining bound to the instance', () => { + const wb = new WorkflowBudget(10); + const view = budgetView(wb); + wb.record({ output: 4 }); + expect(view.total).toBe(10); + expect(view.spent()).toBe(4); + expect(view.remaining()).toBe(6); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/budget.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `budget.ts`** + +```ts +// src/workflow/budget.ts +import type { WorkflowBudgetView } from './types.ts'; + +/** + * Tracks the run's output-token spend against an optional target (the ultracode + * "+500k"-style ceiling). `total === null` means no target → unbounded remaining. 
+ */ +export class WorkflowBudget { + private outputTokens = 0; + readonly total: number | null; + + constructor(total: number | null) { + this.total = total; + } + + record(usage: { input?: number; output?: number; cached?: number }): void { + this.outputTokens += usage.output ?? 0; + } + + spent(): number { + return this.outputTokens; + } + + remaining(): number { + return this.total === null + ? Number.POSITIVE_INFINITY + : Math.max(0, this.total - this.outputTokens); + } +} + +export function budgetView(wb: WorkflowBudget): WorkflowBudgetView { + return { + total: wb.total, + spent: () => wb.spent(), + remaining: () => wb.remaining(), + }; +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/budget.test.ts` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/budget.ts packages/landlord/test/workflow/budget.test.ts +git commit -m "feat(landlord): workflow budget token-target tracker" +``` + +--- + +## Task 4: Event emitter + +**Files:** +- Create: `packages/landlord/src/workflow/events.ts` +- Test: `packages/landlord/test/workflow/events.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/events.test.ts +import { describe, expect, it } from 'vitest'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import type { WorkflowEvent } from '../../src/workflow/types.ts'; + +describe('EventEmitter', () => { + it('records events and forwards them to the sink', () => { + const seen: WorkflowEvent[] = []; + const em = new EventEmitter((e) => seen.push(e)); + em.emit({ type: 'log', message: 'hi' }); + em.emit({ type: 'phase_started', title: 'Find' }); + expect(seen).toHaveLength(2); + expect(em.all().map((e) => e.type)).toEqual(['log', 'phase_started']); + }); + + it('works with no sink', () => { + const em = new EventEmitter(); + em.emit({ type: 'log', message: 'x' }); + expect(em.all()).toHaveLength(1); + }); +}); +``` + +- [ ] **Step 2: Run 
test to verify it fails** + +Run: `pnpm vitest run test/workflow/events.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `events.ts`** + +```ts +// src/workflow/events.ts +import type { WorkflowEvent } from './types.ts'; + +export type EventSink = (event: WorkflowEvent) => void; + +export class EventEmitter { + private readonly events: WorkflowEvent[] = []; + + constructor(private readonly sink?: EventSink) {} + + emit(event: WorkflowEvent): void { + this.events.push(event); + this.sink?.(event); + } + + all(): WorkflowEvent[] { + return this.events; + } +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/events.test.ts` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/events.ts packages/landlord/test/workflow/events.test.ts +git commit -m "feat(landlord): workflow event emitter" +``` + +--- + +## Task 5: Journal (resume/replay) + +**Files:** +- Create: `packages/landlord/src/workflow/journal.ts` +- Test: `packages/landlord/test/workflow/journal.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/journal.test.ts +import { mkdtemp } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { describe, expect, it } from 'vitest'; +import { fileJournalStore, hashCall, memoryJournalStore } from '../../src/workflow/journal.ts'; + +describe('hashCall', () => { + it('is stable regardless of opts key order', () => { + const a = hashCall('p', { label: 'x', phase: 'y' }); + const b = hashCall('p', { phase: 'y', label: 'x' }); + expect(a).toBe(b); + }); + it('changes when the prompt changes', () => { + expect(hashCall('a', {})).not.toBe(hashCall('b', {})); + }); +}); + +describe('memoryJournalStore', () => { + it('appends and loads entries in order', async () => { + const s = memoryJournalStore(); + await s.append('run1', { index: 0, hash: 'h0', result: 'r0' }); + await s.append('run1', { 
index: 1, hash: 'h1', result: 'r1' }); + const entries = await s.load('run1'); + expect(entries.map((e) => e.result)).toEqual(['r0', 'r1']); + }); +}); + +describe('fileJournalStore', () => { + it('round-trips entries through JSONL', async () => { + const dir = await mkdtemp(join(tmpdir(), 'jrnl-')); + const s = fileJournalStore(dir); + await s.append('runA', { index: 0, hash: 'h', result: { ok: true } }); + const entries = await s.load('runA'); + expect(entries).toEqual([{ index: 0, hash: 'h', result: { ok: true } }]); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/journal.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `journal.ts`** + +```ts +// src/workflow/journal.ts +import { appendFile, mkdir, readFile } from 'node:fs/promises'; +import { join } from 'node:path'; + +export type JournalEntry = { index: number; hash: string; result: unknown }; + +export interface JournalStore { + append(runId: string, entry: JournalEntry): Promise<void>; + load(runId: string): Promise<JournalEntry[]>; +} + +function stableStringify(value: unknown): string { + if (value === null || typeof value !== 'object') return JSON.stringify(value) ?? 'null'; + if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`; + const keys = Object.keys(value as Record<string, unknown>).sort(); + const body = keys + .map((k) => `${JSON.stringify(k)}:${stableStringify((value as Record<string, unknown>)[k])}`) + .join(','); + return `{${body}}`; +} + +/** FNV-1a (32-bit) hex of the stable-stringified call signature. */ +export function hashCall(prompt: string, opts: unknown): string { + const input = stableStringify({ prompt, opts: opts ??
{} }); + let h = 0x811c9dc5; + for (let i = 0; i < input.length; i++) { + h ^= input.charCodeAt(i); + h = Math.imul(h, 0x01000193); + } + return (h >>> 0).toString(16).padStart(8, '0'); +} + +export function memoryJournalStore(): JournalStore { + const runs = new Map(); + return { + async append(runId, entry) { + const list = runs.get(runId) ?? []; + list.push(entry); + runs.set(runId, list); + }, + async load(runId) { + return [...(runs.get(runId) ?? [])]; + }, + }; +} + +export function fileJournalStore(dir: string): JournalStore { + const path = (runId: string) => join(dir, `journal-${runId}.jsonl`); + return { + async append(runId, entry) { + await mkdir(dir, { recursive: true }); + await appendFile(path(runId), `${JSON.stringify(entry)}\n`, 'utf-8'); + }, + async load(runId) { + let text: string; + try { + text = await readFile(path(runId), 'utf-8'); + } catch { + return []; + } + return text + .split('\n') + .filter((l) => l.trim().length > 0) + .map((l) => JSON.parse(l) as JournalEntry); + }, + }; +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/journal.test.ts` +Expected: PASS. 
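
The journal exists so that a resumed run can replay the longest unchanged prefix of `agent()` calls. A minimal sketch of that selection follows, under a simplifying assumption: the hashes of the current run's calls are known up front, whereas the real runtime discovers them one call at a time. `replayablePrefix` is a hypothetical name, and `JournalEntry` is redeclared only to keep the sketch self-contained.

```ts
// Same shape as JournalEntry in journal.ts, redeclared so this sketch stands alone.
type JournalEntry = { index: number; hash: string; result: unknown };

/**
 * Hypothetical helper: return journal entries 0..k-1, where k is the first call
 * index whose recorded hash is missing or differs from the current call's hash.
 */
export function replayablePrefix(
  journal: JournalEntry[],
  currentHashes: string[],
): JournalEntry[] {
  const byIndex = new Map<number, JournalEntry>();
  for (const entry of journal) byIndex.set(entry.index, entry);

  const prefix: JournalEntry[] = [];
  for (let i = 0; i < currentHashes.length; i++) {
    const entry = byIndex.get(i);
    // The first missing or changed call ends the replay; later matches are ignored.
    if (entry === undefined || entry.hash !== currentHashes[i]) break;
    prefix.push(entry);
  }
  return prefix;
}

const recorded: JournalEntry[] = [
  { index: 0, hash: 'h0', result: 'r0' },
  { index: 1, hash: 'h1', result: 'r1' },
  { index: 2, hash: 'h2', result: 'r2' },
];
// The second call changed, so only the first entry is replayable.
console.log(replayablePrefix(recorded, ['h0', 'hX', 'h2']).map((e) => e.result));
```

Stopping at the first mismatch matters because a call after a changed one may depend on its output; a per-index comparison alone would also reuse index 2 in the example above.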
+ +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/journal.ts packages/landlord/test/workflow/journal.test.ts +git commit -m "feat(landlord): journal store and call hashing for resume" +``` + +--- + +## Task 6: Registries (agent types + named workflows) + +**Files:** +- Create: `packages/landlord/src/workflow/registry.ts` +- Test: `packages/landlord/test/workflow/registry.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/registry.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowError } from '../../src/workflow/errors.ts'; +import { + createAgentRegistry, + createWorkflowRegistry, +} from '../../src/workflow/registry.ts'; + +describe('createAgentRegistry', () => { + it('resolves built-in types', () => { + const reg = createAgentRegistry(); + expect(reg.has('default')).toBe(true); + expect(reg.has('Explore')).toBe(true); + expect(reg.has('code-reviewer')).toBe(true); + expect(reg.resolve('default').tools?.('/tmp/x').length).toBeGreaterThan(0); + }); + + it('merges custom types over built-ins and throws on unknown', () => { + const reg = createAgentRegistry({ custom: { systemPrompt: 'You are custom.' } }); + expect(reg.resolve('custom').systemPrompt).toBe('You are custom.'); + expect(() => reg.resolve('missing')).toThrow(WorkflowError); + }); +}); + +describe('createWorkflowRegistry', () => { + it('resolves named sources', () => { + const reg = createWorkflowRegistry({ greet: 'return "hi"' }); + expect(reg.resolve('greet')).toBe('return "hi"'); + expect(reg.resolve('nope')).toBeUndefined(); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/registry.test.ts` +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Write `registry.ts`** + +```ts +// src/workflow/registry.ts +import type { Tool } from 'flint'; +import { WorkflowError } from './errors.ts'; +import { bashTool, fileReadTool, webFetchTool } from '../tools/index.ts'; +import { standardTools } from '../tools/index.ts'; + +export type AgentType = { + systemPrompt: string; + tools?: (workDir: string) => Tool[]; + model?: string; +}; + +export type AgentTypeRegistry = { + resolve(name: string): AgentType; + has(name: string): boolean; +}; + +export const BUILT_IN_AGENT_TYPES: Record = { + default: { + systemPrompt: + 'You are a focused worker agent. Use your tools to accomplish the task. ' + + 'When a structured result is requested, return it by calling the structured_output tool.', + tools: (workDir) => standardTools(workDir), + }, + Explore: { + systemPrompt: + 'You are a read-only exploration agent. Search broadly, read excerpts rather than whole ' + + 'files, and return conclusions — never modify anything. You have read and web tools only.', + tools: (workDir) => [fileReadTool(workDir), webFetchTool(workDir)], + }, + 'code-reviewer': { + systemPrompt: + 'You are a code reviewer. Read the relevant code and report concrete issues (bugs, security, ' + + 'quality) with file and line references. Return findings via structured_output when asked.', + tools: (workDir) => [fileReadTool(workDir), bashTool(workDir)], + }, +}; + +export function createAgentRegistry(custom?: Record): AgentTypeRegistry { + const merged: Record = { ...BUILT_IN_AGENT_TYPES, ...(custom ?? {}) }; + return { + has: (name) => name in merged, + resolve: (name) => { + const t = merged[name]; + if (t === undefined) { + throw new WorkflowError( + `Unknown agentType '${name}'. 
Known: ${Object.keys(merged).join(', ')}`, + 'workflow.unknown_agent_type', + ); + } + return t; + }, + }; +} + +export type WorkflowRegistry = { + resolve(name: string): string | undefined; +}; + +export function createWorkflowRegistry(scripts: Record): WorkflowRegistry { + return { resolve: (name) => scripts[name] }; +} +``` + +> Note: `src/tools/index.ts` already exports `standardTools`, `fileReadTool`, `webFetchTool`, and `bashTool`. The two `import` lines are kept separate only for clarity; Biome's `organizeImports` will merge them on format — run `pnpm format` before committing if it complains. + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/registry.test.ts && pnpm format` +Expected: PASS; format clean. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/registry.ts packages/landlord/test/workflow/registry.test.ts +git commit -m "feat(landlord): agent-type and workflow registries" +``` + +--- + +## Task 7: Isolation backends + +**Files:** +- Create: `packages/landlord/src/workflow/isolation.ts` +- Test: `packages/landlord/test/workflow/isolation.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/isolation.test.ts +import { mkdtemp, stat } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { describe, expect, it } from 'vitest'; +import { gitWorktreeIsolation, workdirIsolation } from '../../src/workflow/isolation.ts'; + +describe('workdirIsolation', () => { + it('creates a distinct existing directory per acquire', async () => { + const base = await mkdtemp(join(tmpdir(), 'iso-')); + const backend = workdirIsolation(base); + const a = await backend.acquire('alpha'); + const b = await backend.acquire('alpha'); + expect(a.workDir).not.toBe(b.workDir); + expect((await stat(a.workDir)).isDirectory()).toBe(true); + await a.release(); + await b.release(); + }); +}); + +describe('gitWorktreeIsolation', () => { + 
it('falls back to a workdir lease outside a git repo', async () => { + const base = await mkdtemp(join(tmpdir(), 'iso2-')); + const notRepo = await mkdtemp(join(tmpdir(), 'norepo-')); + const backend = gitWorktreeIsolation(notRepo, base); + const lease = await backend.acquire('w'); + expect((await stat(lease.workDir)).isDirectory()).toBe(true); + await lease.release(); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/isolation.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `isolation.ts`** + +```ts +// src/workflow/isolation.ts +import { exec } from 'node:child_process'; +import { mkdir } from 'node:fs/promises'; +import { join } from 'node:path'; +import { promisify } from 'node:util'; + +const execAsync = promisify(exec); + +export type IsolationLease = { workDir: string; release: () => Promise }; + +export interface IsolationBackend { + acquire(label: string): Promise; +} + +function sanitize(label: string): string { + return label.replace(/[^a-zA-Z0-9_-]/g, '_').slice(0, 40) || 'agent'; +} + +export function workdirIsolation(baseDir: string): IsolationBackend { + let counter = 0; + return { + async acquire(label) { + const workDir = join(baseDir, `${sanitize(label)}-${counter++}`); + await mkdir(workDir, { recursive: true }); + return { workDir, release: async () => {} }; + }, + }; +} + +export function gitWorktreeIsolation(repoDir: string, baseDir: string): IsolationBackend { + const fallback = workdirIsolation(baseDir); + let counter = 0; + return { + async acquire(label) { + try { + await execAsync('git rev-parse --is-inside-work-tree', { cwd: repoDir }); + } catch { + return fallback.acquire(label); + } + const workDir = join(baseDir, `wt-${sanitize(label)}-${counter++}`); + try { + await execAsync(`git worktree add --detach ${JSON.stringify(workDir)}`, { cwd: repoDir }); + } catch { + return fallback.acquire(label); + } + return { + workDir, + release: async () => { + try { + 
await execAsync(`git worktree remove --force ${JSON.stringify(workDir)}`, { + cwd: repoDir, + }); + } catch { + /* leave the worktree for inspection if removal fails */ + } + }, + }; + }, + }; +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/isolation.test.ts` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/isolation.ts packages/landlord/test/workflow/isolation.test.ts +git commit -m "feat(landlord): workdir and git-worktree isolation backends" +``` + +--- + +## Task 8: Structured output (schema → forced tool) + +**Files:** +- Create: `packages/landlord/src/workflow/schema.ts` +- Test: `packages/landlord/test/workflow/schema.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/schema.test.ts +import { execute } from 'flint'; +import { describe, expect, it } from 'vitest'; +import { makeStructuredOutput } from '../../src/workflow/schema.ts'; + +describe('makeStructuredOutput', () => { + it('captures a valid object and reports success', async () => { + const so = makeStructuredOutput({ + type: 'object', + properties: { name: { type: 'string' } }, + required: ['name'], + }); + const res = await execute(so.tool, { name: 'ada' }); + expect(res.ok).toBe(true); + expect(so.getValue()).toEqual({ name: 'ada' }); + }); + + it('rejects an invalid object and leaves value undefined', async () => { + const so = makeStructuredOutput({ + type: 'object', + properties: { n: { type: 'number' } }, + required: ['n'], + }); + const res = await execute(so.tool, { n: 'not-a-number' }); + expect(res.ok).toBe(true); // handler returns an error string, not a thrown error + expect(String(res.ok ? 
res.value : '')).toMatch(/does not match/i); + expect(so.getValue()).toBeUndefined(); + }); + + it('wraps non-object schemas under a result key and unwraps the captured value', async () => { + const so = makeStructuredOutput({ type: 'array', items: { type: 'string' } }); + await execute(so.tool, { result: ['a', 'b'] }); + expect(so.getValue()).toEqual(['a', 'b']); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/schema.test.ts` +Expected: FAIL — module not found. + +- [ ] **Step 3: Write `schema.ts`** + +```ts +// src/workflow/schema.ts +import Ajv from 'ajv'; +import { tool } from 'flint'; +import type { StandardSchemaV1, Tool } from 'flint'; + +const ajv = new Ajv({ allErrors: true }); + +function anyObjectSchema(): StandardSchemaV1> { + return { + '~standard': { + version: 1, + vendor: 'landlord', + validate: (v) => { + if (typeof v !== 'object' || v === null || Array.isArray(v)) { + return { issues: [{ message: 'Expected an object' }] }; + } + return { value: v as Record }; + }, + }, + }; +} + +export type StructuredOutput = { + tool: Tool; + getValue: () => unknown; +}; + +/** + * Build a forced `structured_output` tool for an `agent()` call. Object schemas + * are presented as-is; non-object schemas are wrapped under a `result` key and + * unwrapped on capture. The handler validates with ajv and returns a corrective + * message on mismatch so the agent loop retries. + */ +export function makeStructuredOutput(schema: Record): StructuredOutput { + const wrapped = schema['type'] !== 'object'; + const jsonSchema: Record = wrapped + ? 
{ type: 'object', properties: { result: schema }, required: ['result'] } + : schema; + + let validate: ReturnType; + try { + validate = ajv.compile(jsonSchema); + } catch { + validate = ajv.compile({ type: 'object' }); + } + + let captured: unknown; + let done = false; + + const t = tool({ + name: 'structured_output', + description: + 'Return your final result as JSON matching the required schema. Call this exactly once.', + input: anyObjectSchema(), + jsonSchema, + handler: (input: Record) => { + if (!validate(input)) { + return `Output does not match the required schema: ${ajv.errorsText(validate.errors)}. Call structured_output again with corrected fields.`; + } + if (!done) { + captured = wrapped ? (input as { result: unknown }).result : input; + done = true; + } + return 'Accepted.'; + }, + }) as unknown as Tool; + + return { tool: t, getValue: () => captured }; +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/schema.test.ts` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/schema.ts packages/landlord/test/workflow/schema.test.ts +git commit -m "feat(landlord): structured-output tool with ajv validation" +``` + +## Task 9: The `agent()` hook + +**Files:** +- Create: `packages/landlord/src/workflow/agentcall.ts` +- Test: `packages/landlord/test/workflow/agentcall.test.ts` + +**Context:** `RunDeps` is the per-run state object threaded into both `runAgentCall` (here) and `buildContext` (Task 10). It is defined here because this is the first consumer. `runtime.ts` (Task 13) constructs it. 
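
To make the threading concrete, here is a deliberately simplified sketch of one per-run state object shared by every hook. All names (`MiniDeps`, `miniAgentCall`, `miniBuildContext`, `makeMiniDeps`, `fanOut`) are hypothetical stand-ins, not the real `RunDeps`, `runAgentCall`, or `buildContext`, and the calls are synchronous here only to keep the example short.

```ts
// Hypothetical, simplified stand-in for RunDeps: state created once per run.
type MiniDeps = {
  runId: string;
  nextIndex: () => number; // shared call counter, analogous to RunDeps.nextIndex
  calls: string[];
};

// Stand-in for runAgentCall: every call draws its index from the shared deps.
function miniAgentCall(prompt: string, deps: MiniDeps): string {
  const index = deps.nextIndex();
  const entry = `${deps.runId}#${index}:${prompt}`;
  deps.calls.push(entry);
  return entry;
}

// Stand-in for buildContext: hooks are closures over the same deps object.
function miniBuildContext(deps: MiniDeps) {
  return {
    agent: (prompt: string) => miniAgentCall(prompt, deps),
    fanOut: (prompts: string[]) => prompts.map((p) => miniAgentCall(p, deps)),
  };
}

// Stand-in for what the runtime does: construct the deps once per run.
export function makeMiniDeps(runId: string): MiniDeps {
  let index = 0;
  return { runId, nextIndex: () => index++, calls: [] };
}

const demoDeps = makeMiniDeps('run-demo');
const wf = miniBuildContext(demoDeps);
wf.agent('plan');
wf.fanOut(['a', 'b']);
console.log(demoDeps.calls);
```

Because both hooks close over the same object, call indices stay unique and ordered across the whole run, which is what the journal's `index` field relies on.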
+ +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/agentcall.test.ts +import type { NormalizedResponse } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import { mkdtemp } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { mockAdapter, scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget } from '../../src/workflow/budget.ts'; +import { AgentCounter, Semaphore } from '../../src/workflow/concurrency.ts'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import { memoryJournalStore } from '../../src/workflow/journal.ts'; +import { createAgentRegistry } from '../../src/workflow/registry.ts'; +import { workdirIsolation } from '../../src/workflow/isolation.ts'; +import { runAgentCall } from '../../src/workflow/agentcall.ts'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; + +function textResponse(content: string): NormalizedResponse { + return { message: { role: 'assistant', content }, usage: { input: 10, output: 5 }, stopReason: 'end' }; +} +function toolCallResponse(name: string, args: unknown): NormalizedResponse { + return { + message: { role: 'assistant', content: '', toolCalls: [{ id: 'tc1', name, arguments: args }] }, + usage: { input: 20, output: 10 }, + stopReason: 'tool_call', + }; +} + +async function makeDeps(adapter: RunDeps['adapter']): Promise { + const base = await mkdtemp(join(tmpdir(), 'ac-')); + let index = 0; + return { + adapter, + models: { default: 'test' }, + flintBudget: makeBudget({ maxSteps: 50 }), + wfBudget: new WorkflowBudget(null), + semaphore: new Semaphore(4), + counter: new AgentCounter(1000), + registry: createAgentRegistry(), + workflows: undefined, + isolation: workdirIsolation(base), + worktreeIsolation: undefined, + emitter: new EventEmitter(), + journal: memoryJournalStore(), + runId: 'run-test', + resumeEntries: [], + signal: undefined, + args: 
undefined, + depth: 0, + nextIndex: () => index++, + currentPhase: { value: undefined }, + }; +} + +describe('runAgentCall', () => { + it('returns the final text for a no-schema call and records the journal', async () => { + const deps = await makeDeps(scriptedAdapter([textResponse('hello world')])); + const result = await runAgentCall('say hi', undefined, deps); + expect(result).toBe('hello world'); + expect(await deps.journal.load('run-test')).toHaveLength(1); + expect(deps.emitter.all().map((e) => e.type)).toEqual(['agent_started', 'agent_complete']); + }); + + it('returns the validated object for a schema call', async () => { + const deps = await makeDeps( + scriptedAdapter([toolCallResponse('structured_output', { name: 'ada' }), textResponse('done')]), + ); + const result = await runAgentCall('produce', { schema: { + type: 'object', properties: { name: { type: 'string' } }, required: ['name'], + } }, deps); + expect(result).toEqual({ name: 'ada' }); + }); + + it('replays a cached result on resume without calling the adapter', async () => { + const throwingAdapter = mockAdapter({ onCall: () => { throw new Error('must not call'); } }); + const deps = await makeDeps(throwingAdapter); + // Pre-seed resume entry: index 0 with the matching hash for ('say hi', {}). + const { hashCall } = await import('../../src/workflow/journal.ts'); + deps.resumeEntries = [{ index: 0, hash: hashCall('say hi', {}), result: 'cached!' }]; + const result = await runAgentCall('say hi', undefined, deps); + expect(result).toBe('cached!'); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/agentcall.test.ts` +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Write `agentcall.ts`** + +```ts +// src/workflow/agentcall.ts +import { agent } from 'flint'; +import type { ProviderAdapter } from 'flint'; +import type { Budget } from 'flint/budget'; +import { standardTools } from '../tools/index.ts'; +import type { AgentCounter, Semaphore } from './concurrency.ts'; +import type { WorkflowBudget } from './budget.ts'; +import { WorkflowError } from './errors.ts'; +import type { EventEmitter } from './events.ts'; +import { hashCall } from './journal.ts'; +import type { JournalEntry, JournalStore } from './journal.ts'; +import type { IsolationBackend } from './isolation.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { makeStructuredOutput } from './schema.ts'; +import type { AgentOpts, Models } from './types.ts'; + +export type RunDeps = { + adapter: ProviderAdapter; + models: Models; + flintBudget: Budget; + wfBudget: WorkflowBudget; + semaphore: Semaphore; + counter: AgentCounter; + registry: AgentTypeRegistry; + workflows: WorkflowRegistry | undefined; + isolation: IsolationBackend; + worktreeIsolation: IsolationBackend | undefined; + emitter: EventEmitter; + journal: JournalStore; + runId: string; + resumeEntries: JournalEntry[]; + signal: AbortSignal | undefined; + args: unknown; + depth: number; + nextIndex: () => number; + currentPhase: { value: string | undefined }; +}; + +export function deriveLabel(prompt: string): string { + const firstLine = prompt.split('\n', 1)[0] ?? prompt; + return firstLine.length > 48 ? `${firstLine.slice(0, 48)}…` : firstLine; +} + +export async function runAgentCall( + prompt: string, + opts: AgentOpts | undefined, + deps: RunDeps, +): Promise { + const index = deps.nextIndex(); + const hash = hashCall(prompt, opts ?? {}); + + // Resume: replay a cached result when this call's signature is unchanged. 
+ const cached = deps.resumeEntries.find((e) => e.index === index); + if (cached !== undefined && cached.hash === hash) { + return cached.result; + } + + // Token-target ceiling. + if (deps.wfBudget.total !== null && deps.wfBudget.remaining() <= 0) { + throw new WorkflowError(`Workflow token target (${deps.wfBudget.total}) reached`, 'workflow.budget'); + } + + deps.counter.increment(); + const release = await deps.semaphore.acquire(); + + const label = opts?.label ?? deriveLabel(prompt); + const phase = opts?.phase; + const agentType = opts?.agentType ?? 'default'; + const preset = deps.registry.resolve(agentType); + const model = opts?.model ?? preset.model ?? deps.models.default; + const backend = + opts?.isolation === 'worktree' && deps.worktreeIsolation !== undefined + ? deps.worktreeIsolation + : deps.isolation; + const lease = await backend.acquire(label); + + deps.emitter.emit({ + type: 'agent_started', + label, + ...(phase !== undefined ? { phase } : {}), + agentType, + model, + }); + + try { + const baseTools = preset.tools ? preset.tools(lease.workDir) : standardTools(lease.workDir); + let result: unknown; + let tokens = 0; + + if (opts?.schema !== undefined) { + const so = makeStructuredOutput(opts.schema); + const systemPrompt = + `${preset.systemPrompt}\n\nYou MUST call the structured_output tool exactly once with your ` + + 'final result as JSON matching the required schema. Do not finish until you have called it.'; + const out = await agent({ + adapter: deps.adapter, + model, + messages: [ + { role: 'system', content: systemPrompt }, + { role: 'user', content: prompt }, + ], + tools: [so.tool, ...baseTools], + budget: deps.flintBudget, + ...(deps.signal !== undefined ? 
{ signal: deps.signal } : {}), + }); + if (!out.ok) throw out.error; + deps.wfBudget.record(out.value.usage); + tokens = out.value.usage.input + out.value.usage.output; + const value = so.getValue(); + if (value === undefined) { + throw new WorkflowError( + `Agent '${label}' finished without producing structured output`, + 'workflow.no_output', + ); + } + result = value; + } else { + const out = await agent({ + adapter: deps.adapter, + model, + messages: [ + { role: 'system', content: preset.systemPrompt }, + { role: 'user', content: prompt }, + ], + tools: baseTools, + budget: deps.flintBudget, + ...(deps.signal !== undefined ? { signal: deps.signal } : {}), + }); + if (!out.ok) throw out.error; + deps.wfBudget.record(out.value.usage); + tokens = out.value.usage.input + out.value.usage.output; + result = out.value.message.content; + } + + deps.emitter.emit({ + type: 'agent_complete', + label, + ...(phase !== undefined ? { phase } : {}), + tokens, + }); + await deps.journal.append(deps.runId, { index, hash, result }); + return result; + } catch (e) { + deps.emitter.emit({ + type: 'agent_error', + label, + ...(phase !== undefined ? { phase } : {}), + error: e instanceof Error ? e.message : String(e), + }); + throw e; + } finally { + await lease.release(); + release(); + } +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/agentcall.test.ts` +Expected: PASS (all three cases). + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/agentcall.ts packages/landlord/test/workflow/agentcall.test.ts +git commit -m "feat(landlord): agent() hook with schema, isolation, journaling" +``` + +--- + +## Task 10: The workflow context (`buildContext`) + +**Files:** +- Create: `packages/landlord/src/workflow/hooks.ts` +- Test: `packages/landlord/test/workflow/hooks.test.ts` + +**Context:** `buildContext` assembles the `WorkflowContext`. 
The nested `workflow()` implementation is injected by the runtime (Task 13) as `workflowFn` to avoid a hooks↔runtime import cycle. + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/hooks.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget } from '../../src/workflow/budget.ts'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import { buildContext } from '../../src/workflow/hooks.ts'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; + +function fakeDeps(): RunDeps { + return { + emitter: new EventEmitter(), + wfBudget: new WorkflowBudget(100), + args: { topic: 'x' }, + currentPhase: { value: undefined }, + // unused-by-these-tests fields: + } as unknown as RunDeps; +} + +describe('buildContext combinators', () => { + it('parallel maps a throwing thunk to null', async () => { + const ctx = buildContext(fakeDeps(), async () => null); + const out = await ctx.parallel([ + async () => 1, + async () => { + throw new Error('x'); + }, + ]); + expect(out).toEqual([1, null]); + }); + + it('pipeline runs stages per item and drops a throwing item to null', async () => { + const ctx = buildContext(fakeDeps(), async () => null); + const out = await ctx.pipeline( + [1, 2], + (prev) => (prev as number) + 1, + (prev, original, i) => { + if (original === 2) throw new Error('boom'); + return `${prev}@${i}`; + }, + ); + expect(out).toEqual(['2@0', null]); + }); + + it('phase and log emit events; budget and args are exposed', async () => { + const deps = fakeDeps(); + const ctx = buildContext(deps, async () => null); + ctx.phase('Find'); + ctx.log('looking'); + expect(deps.emitter.all().map((e) => e.type)).toEqual(['phase_started', 'log']); + expect(ctx.budget.total).toBe(100); + expect(ctx.args).toEqual({ topic: 'x' }); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/hooks.test.ts` +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Write `hooks.ts`** + +```ts +// src/workflow/hooks.ts +import { runAgentCall } from './agentcall.ts'; +import type { RunDeps } from './agentcall.ts'; +import { budgetView } from './budget.ts'; +import type { AgentOpts, StageFn, WorkflowContext } from './types.ts'; + +export function buildContext( + deps: RunDeps, + workflowFn: WorkflowContext['workflow'], +): WorkflowContext { + const agent = (prompt: string, opts?: AgentOpts): Promise<unknown> => { + const phase = opts?.phase ?? deps.currentPhase.value; + const merged: AgentOpts = { ...(opts ?? {}), ...(phase !== undefined ? { phase } : {}) }; + return runAgentCall(prompt, merged, deps); + }; + + const parallel = async (thunks: Array<() => Promise<unknown>>): Promise<Array<unknown>> => + Promise.all( + thunks.map(async (thunk) => { + try { + return await thunk(); + } catch { + return null; + } + }), + ); + + const pipeline = async (items: unknown[], ...stages: StageFn[]): Promise<unknown[]> => + Promise.all( + items.map(async (item, index) => { + let acc: unknown = item; + for (const stage of stages) { + try { + acc = await stage(acc, item, index); + } catch { + return null; + } + } + return acc; + }), + ); + + const phase = (title: string): void => { + deps.currentPhase.value = title; + deps.emitter.emit({ type: 'phase_started', title }); + }; + + const log = (message: string): void => { + deps.emitter.emit({ type: 'log', message }); + }; + + return { + agent, + parallel, + pipeline, + phase, + log, + args: deps.args, + budget: budgetView(deps.wfBudget), + workflow: workflowFn, + }; +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/hooks.test.ts` +Expected: PASS. 
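The "no barrier" property of `pipeline` is easy to state and easy to lose in a refactor, so a small standalone demonstration may help. The sketch below re-implements the same per-item loop under the hypothetical names `Stage`, `pipelineSketch`, and `demo` (none of them are part of the package) and holds one item in stage 1 until another item has already reached stage 2.

```typescript
type Stage = (prev: unknown, original: unknown, index: number) => unknown | Promise<unknown>;

// Same shape as the pipeline hook: every item walks the stages on its own.
async function pipelineSketch(items: unknown[], ...stages: Stage[]): Promise<unknown[]> {
  return Promise.all(
    items.map(async (item, index) => {
      let acc: unknown = item;
      for (const stage of stages) {
        try {
          acc = await stage(acc, item, index);
        } catch {
          return null; // a throwing stage drops only this item
        }
      }
      return acc;
    }),
  );
}

// 'slow' waits in stage 1 on a gate that is opened only when 'fast' reaches stage 2.
async function demo(): Promise<{ order: string[]; out: unknown[] }> {
  const order: string[] = [];
  let openGate: () => void = () => {};
  const gate = new Promise<void>((resolve) => {
    openGate = resolve;
  });
  const out = await pipelineSketch(
    ['fast', 'slow'],
    async (prev) => {
      if (prev === 'slow') await gate;
      order.push(`stage1:${String(prev)}`);
      return prev;
    },
    (prev) => {
      order.push(`stage2:${String(prev)}`);
      if (prev === 'fast') openGate();
      return prev;
    },
  );
  return { order, out };
}
```

Here `demo()` records `stage2:fast` before `stage1:slow`. If a barrier separated the two stages, the gate would never open and the demo would hang, which is the behavioural difference between `pipeline` and a `parallel` call per stage.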
+ +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/hooks.ts packages/landlord/test/workflow/hooks.test.ts +git commit -m "feat(landlord): workflow context (parallel/pipeline/phase/log)" +``` + +--- + +## Task 11: Meta parser + determinism sandbox + +**Files:** +- Create: `packages/landlord/src/workflow/meta.ts` +- Create: `packages/landlord/src/workflow/sandbox.ts` +- Test: `packages/landlord/test/workflow/meta.test.ts` +- Test: `packages/landlord/test/workflow/sandbox.test.ts` + +- [ ] **Step 1: Write the failing tests** + +```ts +// test/workflow/meta.test.ts +import { describe, expect, it } from 'vitest'; +import { MetaError } from '../../src/workflow/errors.ts'; +import { parseMeta } from '../../src/workflow/meta.ts'; + +describe('parseMeta', () => { + it('parses a pure object literal with nested arrays', () => { + const meta = parseMeta( + `export const meta = { name: 'rev', description: "Review", phases: [{ title: 'A' }, { title: 'B', detail: 'x' }] }\nphase('A')`, + ); + expect(meta.name).toBe('rev'); + expect(meta.description).toBe('Review'); + expect(meta.phases).toEqual([{ title: 'A' }, { title: 'B', detail: 'x' }]); + }); + + it('rejects a non-literal value (function call) in meta', () => { + expect(() => parseMeta(`export const meta = { name: foo(), description: 'x' }`)).toThrow(MetaError); + }); + + it('rejects meta missing name/description', () => { + expect(() => parseMeta(`export const meta = { name: 'x' }`)).toThrow(MetaError); + }); +}); +``` + +```ts +// test/workflow/sandbox.test.ts +import { describe, expect, it } from 'vitest'; +import { sandboxBindings } from '../../src/workflow/sandbox.ts'; + +describe('sandboxBindings', () => { + it('blocks Date, Math.random, and process but allows pure Math', () => { + const b = sandboxBindings(); + const D = b['Date'] as { now: () => number }; + const M = b['Math'] as Math; + const P = b['process'] as { cwd: () => string }; + expect(() => D.now()).toThrow(); + expect(() => new 
(b['Date'] as unknown as new () => unknown)()).toThrow(); + expect(() => M.random()).toThrow(); + expect(M.floor(3.7)).toBe(3); + expect(() => P.cwd()).toThrow(); + }); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `pnpm vitest run test/workflow/meta.test.ts test/workflow/sandbox.test.ts` +Expected: FAIL — modules not found. + +- [ ] **Step 3: Write `meta.ts`** + +```ts +// src/workflow/meta.ts +import { MetaError } from './errors.ts'; +import type { Meta } from './types.ts'; + +function skipWs(s: string, i: number): number { + while (i < s.length) { + const c = s[i]; + if (c === ' ' || c === '\n' || c === '\t' || c === '\r') i++; + else break; + } + return i; +} + +function parseString(s: string, i: number): { value: string; end: number } { + const quote = s[i]; + i++; + let out = ''; + while (i < s.length && s[i] !== quote) { + if (s[i] === '\\') { + const n = s[i + 1]; + out += + n === 'n' ? '\n' + : n === 't' ? '\t' + : n === 'r' ? '\r' + : n === '\\' ? '\\' + : n === quote ? quote + : (n ?? 
''); + i += 2; + } else { + out += s[i]; + i++; + } + } + if (s[i] !== quote) throw new MetaError('Unterminated string in meta literal'); + return { value: out, end: i + 1 }; +} + +function parseNumber(s: string, i: number): { value: number; end: number } { + const m = /^-?\d+(\.\d+)?([eE][+-]?\d+)?/.exec(s.slice(i)); + if (!m) throw new MetaError('Invalid number in meta literal'); + return { value: Number(m[0]), end: i + m[0].length }; +} + +function parseKey(s: string, i: number): { value: string; end: number } { + const c = s[i]; + if (c === '"' || c === "'") return parseString(s, i); + const m = /^[A-Za-z_$][\w$]*/.exec(s.slice(i)); + if (!m) throw new MetaError(`Invalid object key at index ${i}`); + return { value: m[0], end: i + m[0].length }; +} + +export function parseLiteral(s: string, start = 0): { value: unknown; end: number } { + const i = skipWs(s, start); + const ch = s[i]; + if (ch === '{') { + let j = skipWs(s, i + 1); + const obj: Record<string, unknown> = {}; + if (s[j] === '}') return { value: obj, end: j + 1 }; + while (j < s.length) { + j = skipWs(s, j); + const key = parseKey(s, j); + j = skipWs(s, key.end); + if (s[j] !== ':') throw new MetaError(`Expected ':' at index ${j}`); + const val = parseLiteral(s, j + 1); + obj[key.value] = val.value; + j = skipWs(s, val.end); + if (s[j] === ',') { + j = skipWs(s, j + 1); + if (s[j] === '}') return { value: obj, end: j + 1 }; + continue; + } + if (s[j] === '}') return { value: obj, end: j + 1 }; + throw new MetaError(`Expected ',' or '}' at index ${j}`); + } + throw new MetaError('Unterminated object in meta literal'); + } + if (ch === '[') { + let j = skipWs(s, i + 1); + const arr: unknown[] = []; + if (s[j] === ']') return { value: arr, end: j + 1 }; + while (j < s.length) { + const val = parseLiteral(s, j); + arr.push(val.value); + j = skipWs(s, val.end); + if (s[j] === ',') { + j = skipWs(s, j + 1); + if (s[j] === ']') return { value: arr, 
end: j + 1 }; + throw new MetaError(`Expected ',' or ']' at index ${j}`); + } + throw new MetaError('Unterminated array in meta literal'); + } + if (ch === '"' || ch === "'") return parseString(s, i); + if (ch === '-' || (ch !== undefined && ch >= '0' && ch <= '9')) return parseNumber(s, i); + if (s.startsWith('true', i)) return { value: true, end: i + 4 }; + if (s.startsWith('false', i)) return { value: false, end: i + 5 }; + if (s.startsWith('null', i)) return { value: null, end: i + 4 }; + throw new MetaError(`Unexpected token in meta literal at index ${i}: '${s.slice(i, i + 12)}'`); +} + +export function parseMeta(source: string): Meta { + const m = /export\s+const\s+meta\s*=/.exec(source); + if (!m) throw new MetaError('Script is missing `export const meta = { ... }`'); + const { value } = parseLiteral(source, m.index + m[0].length); + if (typeof value !== 'object' || value === null || Array.isArray(value)) { + throw new MetaError('meta must be an object literal'); + } + const meta = value as Record<string, unknown>; + if (typeof meta['name'] !== 'string' || typeof meta['description'] !== 'string') { + throw new MetaError('meta requires string `name` and `description`'); + } + return meta as unknown as Meta; +} +``` + +- [ ] **Step 4: Write `sandbox.ts`** + +```ts +// src/workflow/sandbox.ts +import { WorkflowError } from './errors.ts'; + +function forbidden(name: string): never { + throw new WorkflowError( + `${name} is not available inside a workflow script (nondeterministic or host access)`, + 'workflow.sandbox', + ); +} + +function blockedCallable(name: string): unknown { + // A function expression rather than an arrow: a Proxy is only constructible when its target is, + // so this lets `new Date()` reach the `construct` trap instead of failing with a generic TypeError. + const fn = function (): never { + return forbidden(name); + }; + return new Proxy(fn, { + apply: () => forbidden(name), + construct: () => forbidden(name), + get: (_t, prop) => { + if (prop === 'prototype') return undefined; + return () => forbidden(name); + }, + }); +} + +export function sandboxBindings(): Record<string, unknown> { + const safeMath = new Proxy(Math, { + get: (target, prop) => { + if (prop === 'random') return () => 
forbidden('Math.random'); + return Reflect.get(target, prop); + }, + }); + return { + Date: blockedCallable('Date'), + Math: safeMath, + process: blockedCallable('process'), + require: blockedCallable('require'), + globalThis: blockedCallable('globalThis'), + global: blockedCallable('global'), + fs: blockedCallable('fs'), + eval: blockedCallable('eval'), + Function: blockedCallable('Function'), + }; +} +``` + +- [ ] **Step 5: Run tests to verify they pass** + +Run: `pnpm vitest run test/workflow/meta.test.ts test/workflow/sandbox.test.ts` +Expected: PASS. + +- [ ] **Step 6: Commit** + +```bash +git add packages/landlord/src/workflow/meta.ts packages/landlord/src/workflow/sandbox.ts packages/landlord/test/workflow/meta.test.ts packages/landlord/test/workflow/sandbox.test.ts +git commit -m "feat(landlord): meta literal parser and determinism sandbox" +``` + +--- + +## Task 12: Script compiler + typed authoring + +**Files:** +- Create: `packages/landlord/src/workflow/script.ts` +- Create: `packages/landlord/src/workflow/define.ts` +- Test: `packages/landlord/test/workflow/script.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/script.test.ts +import { describe, expect, it } from 'vitest'; +import { compileScript } from '../../src/workflow/script.ts'; +import { defineWorkflow } from '../../src/workflow/define.ts'; +import { MetaError } from '../../src/workflow/errors.ts'; +import type { WorkflowContext } from '../../src/workflow/types.ts'; + +function fakeCtx(calls: string[]): WorkflowContext { + return { + agent: async (p) => { + calls.push(`agent:${p}`); + return 'R'; + }, + parallel: async (thunks) => Promise.all(thunks.map((t) => t())), + pipeline: async (items) => items, + phase: () => {}, + log: (m) => calls.push(`log:${m}`), + args: { n: 2 }, + budget: { total: null, spent: () => 0, remaining: () => Number.POSITIVE_INFINITY }, + workflow: async () => null, + }; +} + +describe('compileScript', () => { + it('parses meta, injects 
hooks, supports top-level await and return', async () => { + const mod = compileScript( + `export const meta = { name: 'x', description: 'y' }\nlog('hi')\nconst r = await agent('do ' + args.n)\nreturn r`, + ); + expect(mod.meta.name).toBe('x'); + const calls: string[] = []; + const result = await mod.run(fakeCtx(calls)); + expect(result).toBe('R'); + expect(calls).toEqual(['log:hi', 'agent:do 2']); + }); + + it('blocks nondeterministic globals at runtime', async () => { + const mod = compileScript(`export const meta = { name: 'a', description: 'b' }\nreturn Date.now()`); + await expect(mod.run(fakeCtx([]))).rejects.toThrow(); + }); +}); + +describe('defineWorkflow', () => { + it('returns the module and validates meta', () => { + const mod = defineWorkflow({ meta: { name: 'm', description: 'd' }, run: async () => 42 }); + expect(mod.meta.name).toBe('m'); + expect(() => defineWorkflow({ meta: { name: 'm' } as never, run: async () => 1 })).toThrow(MetaError); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/script.test.ts` +Expected: FAIL — modules not found. 
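Both test cases above rest on one mechanism: the script body is compiled as the body of an async function, and hooks and blocked globals are passed in as parameters that shadow the real globals. The sketch below shows that mechanism on its own, as a minimal illustration rather than the package's implementation; `AsyncFn`, `blockedSketch`, and `compileSketch` are hypothetical names.

```typescript
// AsyncFunction is not a global; it is reached through the prototype of any async function.
const AsyncFn = Object.getPrototypeOf(async () => {}).constructor as new (
  ...args: string[]
) => (...args: unknown[]) => Promise<unknown>;

// Stand-in for a blocked global: calling it, constructing it, or calling a member throws.
function blockedSketch(name: string): unknown {
  const fail = (): never => {
    throw new Error(`${name} is blocked in this sketch`);
  };
  // The target is a function expression (not an arrow) so that `new` reaches the construct trap.
  return new Proxy(function () {}, {
    apply: fail,
    construct: fail,
    get: () => fail,
  });
}

// Compile `body` as an async function body whose parameters are the binding names.
// Because it is an async function body, `await` and `return` work at the top level.
function compileSketch(body: string, bindings: Record<string, unknown>): () => Promise<unknown> {
  const names = Object.keys(bindings);
  const fn = new AsyncFn(...names, `'use strict';\n${body}`);
  return () => fn(...names.map((n) => bindings[n]));
}

// Top-level await and return, with `n` injected like a hook.
const ok = compileSketch('const x = await Promise.resolve(2)\nreturn x + n', { n: 1 });

// `Date` inside the body now refers to the blocked parameter, not the global.
const bad = compileSketch('return Date.now()', { Date: blockedSketch('Date') });
```

One caveat with this approach: when the compiled body is strict, `eval` and `arguments` are not allowed as parameter names, so those two identifiers cannot be shadowed through the parameter list of a strict function.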
+ +- [ ] **Step 3: Write `script.ts`** + +```ts +// src/workflow/script.ts +import { parseMeta } from './meta.ts'; +import { sandboxBindings } from './sandbox.ts'; +import type { WorkflowContext, WorkflowModule } from './types.ts'; + +const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor as new ( + ...args: string[] +) => (...args: unknown[]) => Promise<unknown>; + +export function stripModuleSyntax(source: string): string { + let body = source.replace(/export\s+const\s+meta\s*=/, 'const __meta__ ='); + body = body.replace(/^\s*import\s.*$/gm, ''); + body = body.replace(/export\s+default\s+/g, 'return '); + body = body.replace(/export\s+(const|let|var|function|class)\s/g, '$1 '); + return body; +} + +export function compileScript(source: string): WorkflowModule { + const meta = parseMeta(source); + const body = stripModuleSyntax(source); + const sandbox = sandboxBindings(); + const sandboxNames = Object.keys(sandbox); + const sandboxValues = sandboxNames.map((n) => sandbox[n]); + const hookNames = ['agent', 'parallel', 'pipeline', 'phase', 'log', 'args', 'budget', 'workflow']; + // A strict function cannot declare a parameter named `eval` (one of the sandbox names), so the + // outer function stays non-strict and strict mode is applied to an inner async function that + // holds the script body. The inner function is not an arrow, so its `this` is undefined. + const fn = new AsyncFunction( + ...hookNames, + ...sandboxNames, + `return (async function () {\n'use strict';\n${body}\n})();`, + ); + const run = (wf: WorkflowContext): Promise<unknown> => + fn( + wf.agent, + wf.parallel, + wf.pipeline, + wf.phase, + wf.log, + wf.args, + wf.budget, + wf.workflow, + ...sandboxValues, + ); + return { meta, run }; +} +``` + +- [ ] **Step 4: Write `define.ts`** + +```ts +// src/workflow/define.ts +import { MetaError } from './errors.ts'; +import type { WorkflowModule } from './types.ts'; + +export function defineWorkflow(def: WorkflowModule): WorkflowModule { + if ( + def.meta === undefined || + typeof def.meta.name !== 'string' || + typeof def.meta.description !== 'string' + ) { + throw new MetaError('defineWorkflow requires meta with string name and description'); + } + if (typeof def.run !== 'function') { + throw new MetaError('defineWorkflow requires a run function'); + } + return def; +} +``` + +- [ 
] **Step 5: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/script.test.ts` +Expected: PASS. + +- [ ] **Step 6: Commit** + +```bash +git add packages/landlord/src/workflow/script.ts packages/landlord/src/workflow/define.ts packages/landlord/test/workflow/script.test.ts +git commit -m "feat(landlord): script compiler and typed defineWorkflow" +``` + +--- + +## Task 13: The run engine (`runtime.ts`) + +**Files:** +- Create: `packages/landlord/src/workflow/runtime.ts` +- Test: `packages/landlord/test/workflow/runtime.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/runtime.test.ts +import type { NormalizedResponse } from 'flint'; +import { mockAdapter, scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { memoryJournalStore } from '../../src/workflow/journal.ts'; +import { runWorkflow, runWorkflowScript } from '../../src/workflow/runtime.ts'; +import { defineWorkflow } from '../../src/workflow/define.ts'; + +function textResponse(content: string): NormalizedResponse { + return { message: { role: 'assistant', content }, usage: { input: 10, output: 5 }, stopReason: 'end' }; +} + +describe('runWorkflow', () => { + it('runs a single-agent script and reports events', async () => { + const adapter = scriptedAdapter([textResponse('hello')]); + const res = await runWorkflowScript( + `export const meta = { name: 'r', description: 'd' }\nreturn await agent('hi')`, + { adapter, models: { default: 'm' } }, + ); + expect(res.ok).toBe(true); + if (res.ok) { + expect(res.value.result).toBe('hello'); + expect(res.value.events.map((e) => e.type)).toContain('workflow_complete'); + } + }); + + it('replays from a prior run without calling the adapter (resume)', async () => { + const journal = memoryJournalStore(); + const source = `export const meta = { name: 'r', description: 'd' }\nreturn await agent('hi')`; + const r1 = await runWorkflowScript(source, { + adapter: 
scriptedAdapter([textResponse('hello')]), + models: { default: 'm' }, + journal, + runId: 'run1', + }); + expect(r1.ok && r1.value.result).toBe('hello'); + + const throwing = mockAdapter({ onCall: () => { throw new Error('must not be called'); } }); + const r2 = await runWorkflowScript(source, { + adapter: throwing, + models: { default: 'm' }, + journal, + runId: 'run2', + resumeFromRunId: 'run1', + }); + expect(r2.ok && r2.value.result).toBe('hello'); + }); + + it('runs a typed workflow via runWorkflow', async () => { + const mod = defineWorkflow({ + meta: { name: 't', description: 'd' }, + run: async (wf) => { + wf.phase('Work'); + return wf.budget.total; + }, + }); + const res = await runWorkflow(mod, { + adapter: scriptedAdapter([]), + models: { default: 'm' }, + tokenTarget: 500, + }); + expect(res.ok && res.value.result).toBe(500); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/runtime.test.ts` +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Write `runtime.ts`** + +```ts +// src/workflow/runtime.ts +import { randomUUID } from 'node:crypto'; +import { mkdir } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import type { ProviderAdapter, Result } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import type { Budget } from 'flint/budget'; +import type { RunDeps } from './agentcall.ts'; +import { WorkflowBudget } from './budget.ts'; +import { AgentCounter, Semaphore, defaultConcurrency } from './concurrency.ts'; +import { WorkflowError } from './errors.ts'; +import { EventEmitter } from './events.ts'; +import type { EventSink } from './events.ts'; +import { buildContext } from './hooks.ts'; +import { gitWorktreeIsolation, workdirIsolation } from './isolation.ts'; +import type { IsolationBackend } from './isolation.ts'; +import { memoryJournalStore } from './journal.ts'; +import type { JournalStore } from './journal.ts'; +import { createAgentRegistry } from './registry.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { compileScript } from './script.ts'; +import type { + Models, + WorkflowContext, + WorkflowModule, + WorkflowRunResult, +} from './types.ts'; + +export type RuntimeConfig = { + adapter: ProviderAdapter; + models: Models; + args?: unknown; + budget?: Budget; + tokenTarget?: number | null; + registry?: AgentTypeRegistry; + workflows?: WorkflowRegistry; + journal?: JournalStore; + isolation?: IsolationBackend; + worktreeRepoDir?: string; + baseDir?: string; + concurrency?: number; + agentCap?: number; + onEvent?: EventSink; + signal?: AbortSignal; + runId?: string; + resumeFromRunId?: string; +}; + +async function buildDeps(config: RuntimeConfig): Promise<RunDeps> { + const runId = config.runId ?? randomUUID().slice(0, 8); + const baseDir = config.baseDir ?? join(tmpdir(), `flint-workflow-${runId}`); + await mkdir(baseDir, { recursive: true }); + const journal = config.journal ?? 
memoryJournalStore(); + const resumeEntries = + config.resumeFromRunId !== undefined ? await journal.load(config.resumeFromRunId) : []; + let index = 0; + return { + adapter: config.adapter, + models: config.models, + flintBudget: config.budget ?? makeBudget({ maxSteps: 1_000_000 }), + wfBudget: new WorkflowBudget(config.tokenTarget ?? null), + semaphore: new Semaphore(config.concurrency ?? defaultConcurrency()), + counter: new AgentCounter(config.agentCap ?? 1000), + registry: config.registry ?? createAgentRegistry(), + workflows: config.workflows, + isolation: config.isolation ?? workdirIsolation(baseDir), + worktreeIsolation: + config.worktreeRepoDir !== undefined + ? gitWorktreeIsolation(config.worktreeRepoDir, baseDir) + : undefined, + emitter: new EventEmitter(config.onEvent), + journal, + runId, + resumeEntries, + signal: config.signal, + args: config.args, + depth: 0, + nextIndex: () => index++, + currentPhase: { value: undefined }, + }; +} + +function resolveSource( + ref: string | { scriptPath?: string; source?: string }, + workflows: WorkflowRegistry | undefined, +): string { + if (typeof ref === 'string') { + const src = workflows?.resolve(ref); + if (src === undefined) throw new WorkflowError(`Unknown workflow '${ref}'`, 'workflow.unknown'); + return src; + } + if (ref.source !== undefined) return ref.source; + throw new WorkflowError( + 'workflow(): provide a registered name or { source }; { scriptPath } must be read by the caller.', + 'workflow.unknown', + ); +} + +function executeModule(module: WorkflowModule, deps: RunDeps): Promise<unknown> { + const workflowFn: WorkflowContext['workflow'] = async (ref, childArgs) => { + if (deps.depth >= 1) { + throw new WorkflowError('workflow() nesting is one level only', 'workflow.nesting'); + } + const child = compileScript(resolveSource(ref, deps.workflows)); + return executeModule(child, { ...deps, depth: deps.depth + 1, args: childArgs }); + }; + return module.run(buildContext(deps, workflowFn)); +} + +export async 
function runWorkflow( + module: WorkflowModule, + config: RuntimeConfig, +): Promise<Result<WorkflowRunResult>> { + const deps = await buildDeps(config); + try { + const result = await executeModule(module, deps); + deps.emitter.emit({ type: 'workflow_complete', result }); + return { ok: true, value: { runId: deps.runId, result, events: deps.emitter.all() } }; + } catch (e) { + return { ok: false, error: e instanceof Error ? e : new Error(String(e)) }; + } +} + +export async function runWorkflowScript( + source: string, + config: RuntimeConfig, +): Promise<Result<WorkflowRunResult>> { + let module: WorkflowModule; + try { + module = compileScript(source); + } catch (e) { + return { ok: false, error: e instanceof Error ? e : new Error(String(e)) }; + } + return runWorkflow(module, config); +} +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `pnpm vitest run test/workflow/runtime.test.ts && pnpm typecheck` +Expected: PASS; typecheck clean. + +- [ ] **Step 5: Commit** + +```bash +git add packages/landlord/src/workflow/runtime.ts packages/landlord/test/workflow/runtime.test.ts +git commit -m "feat(landlord): workflow run engine with resume and nesting" +``` + +--- + +## Task 14: Model-facing `workflowTool` + guide + +**Files:** +- Create: `packages/landlord/src/workflow/tool.ts` +- Test: `packages/landlord/test/workflow/tool.test.ts` + +- [ ] **Step 1: Write the failing test** + +```ts +// test/workflow/tool.test.ts +import type { NormalizedResponse } from 'flint'; +import { execute } from 'flint'; +import { scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { WORKFLOW_TOOL_GUIDE, workflowTool } from '../../src/workflow/tool.ts'; + +function textResponse(content: string): NormalizedResponse { + return { message: { role: 'assistant', content }, usage: { input: 10, output: 5 }, stopReason: 'end' }; +} + +describe('workflowTool', () => { + it('runs a script supplied as tool input and returns runId + result', async () => { + const adapter = 
scriptedAdapter([textResponse('inner-result')]); + const tool = workflowTool({ adapter, models: { default: 'm' } }); + const res = await execute(tool, { + script: `export const meta = { name: 'x', description: 'y' }\nreturn await agent('go')`, + }); + expect(res.ok).toBe(true); + if (res.ok) { + const parsed = JSON.parse(res.value as string); + expect(parsed.result).toBe('inner-result'); + expect(typeof parsed.runId).toBe('string'); + } + }); + + it('errors clearly when neither script nor name is provided', async () => { + const tool = workflowTool({ adapter: scriptedAdapter([]), models: { default: 'm' } }); + const res = await execute(tool, {}); + expect(res.ok).toBe(true); + expect(String(res.ok ? res.value : '')).toMatch(/provide either/i); + }); +}); + +describe('WORKFLOW_TOOL_GUIDE', () => { + it('documents the core hooks', () => { + expect(WORKFLOW_TOOL_GUIDE).toMatch(/pipeline/); + expect(WORKFLOW_TOOL_GUIDE).toMatch(/parallel/); + expect(WORKFLOW_TOOL_GUIDE).toMatch(/schema/); + }); +}); +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `pnpm vitest run test/workflow/tool.test.ts` +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Write `tool.ts`** + +```ts +// src/workflow/tool.ts +import { agent, tool } from 'flint'; +import type { ProviderAdapter, Result, Tool } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import type { Budget } from 'flint/budget'; +import { z } from 'zod'; +import type { EventSink } from './events.ts'; +import type { IsolationBackend } from './isolation.ts'; +import type { JournalStore } from './journal.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { runWorkflowScript } from './runtime.ts'; +import type { RuntimeConfig } from './runtime.ts'; +import type { Models } from './types.ts'; + +export type WorkflowToolConfig = { + adapter: ProviderAdapter; + models: Models; + registry?: AgentTypeRegistry; + workflows?: WorkflowRegistry; + journal?: JournalStore; + isolation?: IsolationBackend; + onEvent?: EventSink; +}; + +const workflowToolSchema = z.object({ + script: z.string().optional(), + args: z.unknown().optional(), + name: z.string().optional(), + resumeFromRunId: z.string().optional(), +}); + +const WORKFLOW_TOOL_JSON_SCHEMA = { + type: 'object', + properties: { + script: { + type: 'string', + description: 'A workflow JS script beginning with `export const meta = { ... }`.', + }, + args: { description: 'Optional value exposed to the script as `args`.' }, + name: { type: 'string', description: 'Name of a registered workflow to run instead of `script`.' }, + resumeFromRunId: { type: 'string', description: 'Resume a prior run, replaying unchanged agents.' }, + }, +}; + +export function workflowTool(config: WorkflowToolConfig): Tool { + return tool({ + name: 'workflow', + description: + 'Author and run a dynamic multi-agent workflow. Provide a `script` that orchestrates ' + + 'subagents with agent()/parallel()/pipeline()/phase()/log()/budget()/workflow(). 
' + + 'Returns JSON { runId, result }.', + input: workflowToolSchema, + jsonSchema: WORKFLOW_TOOL_JSON_SCHEMA, + handler: async (input) => { + let source = input.script; + if (source === undefined && input.name !== undefined) { + source = config.workflows?.resolve(input.name); + } + if (source === undefined) { + return 'Error: provide either a `script` string or a registered `name`.'; + } + const runtimeConfig: RuntimeConfig = { + adapter: config.adapter, + models: config.models, + ...(config.registry !== undefined ? { registry: config.registry } : {}), + ...(config.workflows !== undefined ? { workflows: config.workflows } : {}), + ...(config.journal !== undefined ? { journal: config.journal } : {}), + ...(config.isolation !== undefined ? { isolation: config.isolation } : {}), + ...(config.onEvent !== undefined ? { onEvent: config.onEvent } : {}), + ...(input.args !== undefined ? { args: input.args } : {}), + ...(input.resumeFromRunId !== undefined ? { resumeFromRunId: input.resumeFromRunId } : {}), + }; + const res = await runWorkflowScript(source, runtimeConfig); + if (!res.ok) return `Error: ${res.error.message}`; + return JSON.stringify({ runId: res.value.runId, result: res.value.result }); + }, + }) as unknown as Tool; +} + +export function orchestratorAgent(config: WorkflowToolConfig) { + const wt = workflowTool(config); + return (prompt: string, opts?: { budget?: Budget; model?: string }): ReturnType<typeof agent> => + agent({ + adapter: config.adapter, + model: opts?.model ?? config.models.default, + messages: [ + { role: 'system', content: WORKFLOW_TOOL_GUIDE }, + { role: 'user', content: prompt }, + ], + tools: [wt], + budget: opts?.budget ?? makeBudget({ maxSteps: 50 }), + }); +} + +export const WORKFLOW_TOOL_GUIDE = `You can orchestrate subagents by writing a workflow script and running it with the \`workflow\` tool. 
+ +A script begins with a pure-literal meta block, then a body using injected hooks: + + export const meta = { name: 'review', description: 'Review changes and verify findings' } + phase('Find') + const findings = await parallel(FINDERS.map(f => () => agent(f.prompt, { schema: FINDINGS }))) + return findings.flat().filter(Boolean) + +Hooks available in the script: +- agent(prompt, opts?) — spawn a subagent. Without a schema it returns the agent's final text; with { schema } (a JSON Schema) it is forced to return a validated object. opts: { label, phase, schema, model, isolation: 'worktree', agentType }. +- parallel(thunks) — run thunks concurrently. This is a BARRIER: it awaits all of them. A thunk that throws becomes null in the result array, so filter(Boolean) before use. +- pipeline(items, ...stages) — run each item through every stage independently, with NO barrier between stages. Each stage receives (prevResult, originalItem, index). A throwing stage drops that item to null. This is the DEFAULT for multi-stage work. +- phase(title) / log(message) — progress grouping and narration. +- args — the input value passed to the run. +- budget — { total, spent(), remaining() } in output tokens; total may be null. Use for loops: while (budget.total && budget.remaining() > 50000) { ... }. +- workflow(nameOrRef, args?) — run another registered workflow inline (one level only). + +Determinism: Date.now(), new Date(), and Math.random() are unavailable inside scripts (they throw) so runs can be resumed. Pass timestamps via args; vary by index for pseudo-randomness. + +Concurrency is capped automatically; the total number of agents per run is capped at 1000. + +Default to pipeline() — only use a barrier (parallel between stages) when stage N genuinely needs all of stage N-1's results at once (dedup/merge, early-exit on zero, cross-item comparison). 
+
+Quality patterns to compose as the task warrants:
+- Adversarial verify: spawn independent skeptics per finding, each prompted to REFUTE; keep only findings that survive a majority.
+- Judge panel: generate N independent attempts from different angles, score with parallel judges, synthesize from the winner.
+- Loop-until-dry: keep spawning finders until K consecutive rounds surface nothing new.
+- Multi-modal sweep: parallel agents each searching a different way; each blind to the others.
+- Completeness critic: a final agent that asks "what's missing?" — its answer becomes the next round of work.
+
+Scale effort to the request: a quick check needs a few agents and single-vote verification; "thoroughly audit this" warrants a larger finder pool plus a 3–5 vote adversarial pass and a synthesis stage.`;
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `pnpm vitest run test/workflow/tool.test.ts`
+Expected: PASS.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add packages/landlord/src/workflow/tool.ts packages/landlord/test/workflow/tool.test.ts
+git commit -m "feat(landlord): workflowTool, orchestratorAgent, and tool guide"
+```
+
+---
+
+## Task 15: Wiring + `orchestrate()` rebuilt on the runtime
+
+**Files:**
+- Create: `packages/landlord/src/workflow/index.ts`
+- Modify: `packages/landlord/src/orchestrate.ts` (full rewrite of the `orchestrate()` function; helpers/types unchanged)
+- Modify: `packages/landlord/src/index.ts`
+- Modify: `packages/landlord/package.json`
+- Modify: `packages/landlord/tsup.config.ts`
+
+**Backward-compat constraint:** `test/orchestrate.test.ts`, `test/tenant.test.ts`, `test/decompose.test.ts`, `test/validate.test.ts`, and `test/contract.test.ts` must pass **unchanged**. The rewrite keeps `decompose()`, `resolveOrder()`, `DependencyCycleError`, all event names, and the `OrchestrateResult` shape identical; it only moves the tenant scheduling inside a runtime run so tenants share the semaphore/runId.
+
+- [ ] **Step 1: Write `workflow/index.ts`**
+
+```ts
+// src/workflow/index.ts
+export { runWorkflow, runWorkflowScript } from './runtime.ts';
+export type { RuntimeConfig } from './runtime.ts';
+export { defineWorkflow } from './define.ts';
+export { compileScript, stripModuleSyntax } from './script.ts';
+export { parseMeta, parseLiteral } from './meta.ts';
+export { sandboxBindings } from './sandbox.ts';
+export { workflowTool, orchestratorAgent, WORKFLOW_TOOL_GUIDE } from './tool.ts';
+export type { WorkflowToolConfig } from './tool.ts';
+export {
+  createAgentRegistry,
+  createWorkflowRegistry,
+  BUILT_IN_AGENT_TYPES,
+} from './registry.ts';
+export type { AgentType, AgentTypeRegistry, WorkflowRegistry } from './registry.ts';
+export { memoryJournalStore, fileJournalStore, hashCall } from './journal.ts';
+export type { JournalEntry, JournalStore } from './journal.ts';
+export { workdirIsolation, gitWorktreeIsolation } from './isolation.ts';
+export type { IsolationBackend, IsolationLease } from './isolation.ts';
+export { WorkflowBudget, budgetView } from './budget.ts';
+export { Semaphore, AgentCounter, defaultConcurrency } from './concurrency.ts';
+export { EventEmitter } from './events.ts';
+export type { EventSink } from './events.ts';
+export { WorkflowError, AgentCapError, MetaError } from './errors.ts';
+export type {
+  AgentOpts,
+  Meta,
+  MetaPhase,
+  Models,
+  StageFn,
+  WorkflowBudgetView,
+  WorkflowContext,
+  WorkflowEvent,
+  WorkflowModule,
+  WorkflowRunResult,
+} from './types.ts';
+```
+
+- [ ] **Step 2: Rewrite `src/orchestrate.ts`**
+
+Replace the entire file with the following. The `DependencyCycleError`, `resolveOrder`, and all exported types are reproduced unchanged; only `orchestrate()` is rebuilt to run inside `runWorkflow`.
+ +```ts +// src/orchestrate.ts +import { mkdir } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import type { ProviderAdapter, Result, Tool } from 'flint'; +import type { Budget } from 'flint/budget'; +import type { Contract } from './contract.ts'; +import { decompose } from './decompose.ts'; +import { runTenant } from './tenant.ts'; +import { defineWorkflow } from './workflow/define.ts'; +import { runWorkflow } from './workflow/runtime.ts'; + +export class DependencyCycleError extends Error { + constructor(message: string) { + super(message); + this.name = 'DependencyCycleError'; + } +} + +export function resolveOrder(contracts: Contract[]): Contract[] { + const byRole = new Map(contracts.map((c) => [c.role, c])); + const WHITE = 0; + const GRAY = 1; + const BLACK = 2; + const color = new Map(contracts.map((c) => [c.role, WHITE])); + const order: Contract[] = []; + + function visit(role: string, stack: string[]): void { + if (color.get(role) === GRAY) { + throw new DependencyCycleError(`Dependency cycle: ${[...stack, role].join(' -> ')}`); + } + if (color.get(role) === BLACK) return; + const entry = byRole.get(role); + if (!entry) return; + color.set(role, GRAY); + for (const dep of entry.dependsOn) { + visit(dep, [...stack, role]); + } + color.set(role, BLACK); + order.push(entry); + } + + for (const c of contracts) visit(c.role, []); + return order; +} + +export type TenantOutcome = + | { status: 'complete'; artifacts: Record } + | { status: 'escalated'; lastError: string; retriesExhausted: number }; + +export type OrchestrateResult = { + status: 'complete' | 'partial'; + tenants: Record; + artifacts: Record>; +}; + +export type LandlordEvent = + | { type: 'tenant_started'; role: string } + | { type: 'checkpoint_passed'; role: string; checkpoint: string } + | { type: 'checkpoint_failed'; role: string; checkpoint: string; reason: string } + | { type: 'tenant_complete'; role: string } + | { type: 'tenant_evicted'; 
role: string; reason: string; retry: number } + | { type: 'tenant_escalated'; role: string } + | { type: 'job_complete'; artifacts: Record> }; + +export type OrchestratorConfig = { + adapter: ProviderAdapter; + landlordModel: string; + tenantModel: string; + /** Shared job-level budget consumed by ALL tenants and the landlord decompose call. */ + budget?: Budget; + outputDir?: string; + onEvent?: (event: LandlordEvent) => void; +}; + +export async function orchestrate( + prompt: string, + toolsFactory: (workDir: string) => Tool[], + config: OrchestratorConfig, +): Promise> { + const decomposeResult = await decompose(prompt, { + adapter: config.adapter, + model: config.landlordModel, + ...(config.budget !== undefined ? { budget: config.budget } : {}), + }); + if (!decomposeResult.ok) return decomposeResult; + const plan = decomposeResult.value; + + try { + resolveOrder(plan); + } catch (e) { + return { ok: false, error: e instanceof Error ? e : new Error(String(e)) }; + } + + const baseOutputDir = config.outputDir ?? 
join(tmpdir(), `landlord-${Date.now()}`); + await mkdir(join(baseOutputDir, 'shared'), { recursive: true }); + + const module = defineWorkflow({ + meta: { name: 'auto-decompose', description: 'Landlord auto-decomposition orchestration' }, + run: async (wf): Promise => { + const gates = new Map< + string, + { promise: Promise>; resolve: (v: Record) => void } + >(); + for (const c of plan) { + let resolve!: (v: Record) => void; + const promise = new Promise>((r) => { + resolve = r; + }); + gates.set(c.role, { promise, resolve }); + } + + const escalatedRoles = new Set(); + const tenantOutcomes: Record = {}; + const jobArtifacts: Record> = {}; + + async function runWithRetry(contract: Contract): Promise { + for (const dep of contract.dependsOn) { + await gates.get(dep)?.promise; + if (escalatedRoles.has(dep)) { + const lastError = `Dependency '${dep}' escalated before this tenant could start`; + escalatedRoles.add(contract.role); + tenantOutcomes[contract.role] = { status: 'escalated', lastError, retriesExhausted: 0 }; + gates.get(contract.role)?.resolve({}); + config.onEvent?.({ type: 'tenant_escalated', role: contract.role }); + return; + } + } + + const sharedArtifacts: Record = {}; + for (const dep of contract.dependsOn) { + const depArtifacts = jobArtifacts[dep] ?? {}; + for (const [k, v] of Object.entries(depArtifacts)) { + sharedArtifacts[`${dep}.${k}`] = v; + } + } + + const workDir = join(baseOutputDir, contract.role); + await mkdir(workDir, { recursive: true }); + config.onEvent?.({ type: 'tenant_started', role: contract.role }); + + let lastError: string | undefined; + for (let attempt = 0; attempt < contract.maxRetries; attempt++) { + const result = await runTenant( + contract, + toolsFactory(workDir), + { + adapter: config.adapter, + model: config.tenantModel, + ...(config.budget !== undefined ? { budget: config.budget } : {}), + workDir, + }, + lastError, + Object.keys(sharedArtifacts).length > 0 ? 
sharedArtifacts : undefined, + ); + + if (result.ok) { + jobArtifacts[contract.role] = result.value; + tenantOutcomes[contract.role] = { status: 'complete', artifacts: result.value }; + gates.get(contract.role)?.resolve(result.value); + config.onEvent?.({ type: 'tenant_complete', role: contract.role }); + return; + } + + lastError = result.error.message; + config.onEvent?.({ + type: 'tenant_evicted', + role: contract.role, + reason: lastError, + retry: attempt + 1, + }); + } + + escalatedRoles.add(contract.role); + tenantOutcomes[contract.role] = { + status: 'escalated', + lastError: lastError ?? 'unknown', + retriesExhausted: contract.maxRetries, + }; + gates.get(contract.role)?.resolve({}); + config.onEvent?.({ type: 'tenant_escalated', role: contract.role }); + } + + await wf.parallel(plan.map((c) => () => runWithRetry(c))); + + const allComplete = Object.values(tenantOutcomes).every((o) => o.status === 'complete'); + const status: 'complete' | 'partial' = allComplete ? 'complete' : 'partial'; + config.onEvent?.({ type: 'job_complete', artifacts: jobArtifacts }); + return { status, tenants: tenantOutcomes, artifacts: jobArtifacts }; + }, + }); + + const runResult = await runWorkflow(module, { + adapter: config.adapter, + models: { default: config.tenantModel }, + ...(config.budget !== undefined ? 
{ budget: config.budget } : {}),
+    baseDir: baseOutputDir,
+  });
+  if (!runResult.ok) return runResult;
+  return { ok: true, value: runResult.value.result as OrchestrateResult };
+}
+```
+
+- [ ] **Step 3: Update `src/index.ts`**
+
+Add this line at the end of the existing exports (keep all current exports intact):
+
+```ts
+export * from './workflow/index.ts';
+```
+
+- [ ] **Step 4: Update `package.json` exports**
+
+Add a `./workflow` entry to the `exports` map (after `./tools`):
+
+```json
+"./workflow": { "types": "./dist/workflow/index.d.ts", "import": "./dist/workflow/index.js" }
+```
+
+- [ ] **Step 5: Update `tsup.config.ts`**
+
+Change the `entry` array to include the workflow entry:
+
+```ts
+entry: ['src/index.ts', 'src/tools/index.ts', 'src/workflow/index.ts'],
+```
+
+- [ ] **Step 6: Run backward-compat + typecheck + build**
+
+Run: `pnpm typecheck && pnpm vitest run test/orchestrate.test.ts test/tenant.test.ts test/decompose.test.ts test/validate.test.ts test/contract.test.ts && pnpm build`
+Expected: typecheck clean; all 5 existing suites PASS; build emits `dist/workflow/index.js`.
+
+- [ ] **Step 7: Commit**
+
+```bash
+git add packages/landlord/src/workflow/index.ts packages/landlord/src/orchestrate.ts packages/landlord/src/index.ts packages/landlord/package.json packages/landlord/tsup.config.ts
+git commit -m "feat(landlord): rebuild orchestrate on runtime; export workflow surface"
+```
+
+---
+
+## Task 16: Docs, changeset, and full verification
+
+**Files:**
+- Not required: `packages/landlord/README.md` (docs live in `docs/`).
+- Create: `docs/landlord/workflow.md`, `docs/landlord/hooks.md`, `docs/landlord/resume.md`, `docs/landlord/agent-types.md`, `docs/landlord/isolation.md`, `docs/landlord/workflow-tool.md`
+- Create: `docs/examples/dynamic-workflow.md`
+- Modify: `docs/landlord/index.md` (add the workflow-runtime mental model), `docs/landlord/orchestrate.md` (note it is runtime-backed), `docs/.vitepress/config.ts` (sidebar/nav), `README.md` (Landlord bullet)
+- Create: `.changeset/landlord-dynamic-workflows.md`
+
+> Docs are prose, not TDD. Write real, runnable TypeScript in every snippet against the API built in Tasks 1–15. Each page ends with a "See also" section (project doc norm). Keep tone developer-to-developer.
+
+- [ ] **Step 1: Write `docs/landlord/workflow.md`**
+
+Cover: the mental model (a workflow is a script/typed function that drives subagents via hooks); a quick start with both `runWorkflowScript()` (string) and `defineWorkflow()` (typed); the full `RuntimeConfig` field table (`adapter`, `models`, `args`, `budget`, `tokenTarget`, `registry`, `workflows`, `journal`, `isolation`, `worktreeRepoDir`, `baseDir`, `concurrency`, `agentCap`, `onEvent`, `signal`, `runId`, `resumeFromRunId`); the `WorkflowEvent` catalog; and a comparison table "workflow runtime vs orchestrate() vs @flint/graph vs agent()". Use the review→verify pipeline as the worked example.
+
+- [ ] **Step 2: Write `docs/landlord/hooks.md`**
+
+Full reference for `agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow`, with the exact signatures from `src/workflow/types.ts`, the barrier-vs-no-barrier distinction, the `AgentOpts` field table (`label`, `phase`, `schema`, `model`, `isolation`, `agentType`), and the structured-output retry behavior.
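The barrier-versus-no-barrier distinction that `hooks.md` has to explain can be illustrated in a few lines. The block below is only a sketch under stated assumptions, not the runtime's implementation: `parallelSketch`, `pipelineSketch`, and `demo` are hypothetical names, and treating the first stage's `prev` as the item itself is an assumption the plan does not spell out.

```typescript
// Hypothetical stand-ins for the parallel() and pipeline() hooks (not the real runtime code).
type Thunk = () => Promise<unknown>;
type Stage = (prev: unknown, originalItem: unknown, index: number) => unknown | Promise<unknown>;

// Barrier: resolves only after every thunk settles; a thunk that throws becomes null.
function parallelSketch(thunks: Thunk[]): Promise<unknown[]> {
  return Promise.all(
    thunks.map((thunk) =>
      Promise.resolve()
        .then(thunk)
        .catch(() => null),
    ),
  );
}

// No barrier: every item walks its own chain of stages; a throwing stage drops that item to null.
function pipelineSketch(items: unknown[], ...stages: Stage[]): Promise<unknown[]> {
  return Promise.all(
    items.map(async (item, index) => {
      try {
        let prev: unknown = item; // assumption: the first stage sees the item as `prev`
        for (const stage of stages) prev = await stage(prev, item, index);
        return prev;
      } catch {
        return null;
      }
    }),
  );
}

async function demo(): Promise<{ trace: string[]; piped: unknown[]; joined: unknown[] }> {
  const trace: string[] = [];
  let openGate: () => void = () => {};
  const gate = new Promise<void>((resolve) => {
    openGate = () => resolve();
  });

  // Item 'a' stalls in stage 1 until item 'b' reaches stage 2. With a barrier
  // between the stages this would never finish; without one it completes.
  const stage1: Stage = async (prev) => {
    if (prev === 'a') await gate;
    trace.push(`${String(prev)}:stage1`);
    return prev;
  };
  const stage2: Stage = (prev) => {
    trace.push(`${String(prev)}:stage2`);
    if (prev === 'b') openGate();
    if (prev === 'c') throw new Error('stage 2 rejects c');
    return `${String(prev)}!`;
  };

  const piped = await pipelineSketch(['a', 'b', 'c'], stage1, stage2); // ['a!', 'b!', null]
  const joined = await parallelSketch([
    async () => 1,
    async () => {
      throw new Error('boom');
    },
    async () => 3,
  ]); // [1, null, 3]
  return { trace, piped, joined };
}
```

In this sketch, `b:stage2` is recorded before `a:stage1`, which is exactly what a stage-level barrier would forbid. Both helpers map a failure to `null` rather than rejecting, so callers are expected to filter out `null` entries before using the results.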
+
+- [ ] **Step 3: Write `docs/landlord/resume.md`**
+
+Explain journaling: `JournalStore`, `memoryJournalStore()` vs `fileJournalStore(dir)`, how `resumeFromRunId` replays the longest unchanged prefix, the determinism requirement (the sandbox makes `Date.now()`, `new Date()`, and `Math.random()` throw in string scripts; typed workflows must avoid nondeterminism themselves), and a runnable resume example.
+
+- [ ] **Step 4: Write `docs/landlord/agent-types.md`**
+
+Document `createAgentRegistry()`, the three built-ins (`default`, `Explore`, `code-reviewer`) with their toolsets and prompts, how `agentType` composes with `schema`, and how to register custom types.
+
+- [ ] **Step 5: Write `docs/landlord/isolation.md`**
+
+Document `IsolationBackend`, `workdirIsolation()` (default, per-agent sandboxed dir), `gitWorktreeIsolation()` (via `worktreeRepoDir`, fallback behavior outside a repo), and when to use `isolation: 'worktree'`.
+
+- [ ] **Step 6: Write `docs/landlord/workflow-tool.md`**
+
+Document `workflowTool()`, `WorkflowToolConfig`, `orchestratorAgent()`, and `WORKFLOW_TOOL_GUIDE` — how to give an `agent()` the ability to author-and-run workflows itself, with a runnable example.
+
+- [ ] **Step 7: Write `docs/examples/dynamic-workflow.md`**
+
+A complete worked example: a review→verify pipeline shown both as a string script and the equivalent `defineWorkflow()`, with `onEvent` logging and the printed result.
+
+- [ ] **Step 8: Update `docs/landlord/index.md`, `docs/landlord/orchestrate.md`, `docs/.vitepress/config.ts`, `README.md`**
+
+- In `index.md`: add a short "Two ways to orchestrate" section — the new script-driven workflow runtime (headline) and the original auto-decompose `orchestrate()` (now a built-in workflow on the runtime).
+- In `orchestrate.md`: add a callout that `orchestrate()` now runs on the workflow runtime; behavior and API are unchanged.
+- In `config.ts`: add the new pages to the `'/landlord/'` sidebar (`Workflows`, `Hooks`, `Resume`, `Agent Types`, `Isolation`, `Workflow Tool`) and add `Dynamic Workflow` to the examples sidebar.
+- In `README.md`: extend the Landlord line in `## Packages` / the docs list to mention "dynamic workflow runtime (ultracode-style script orchestration)".
+
+- [ ] **Step 9: Write the changeset**
+
+Create `.changeset/landlord-dynamic-workflows.md` with this content:
+
+```md
+---
+"landlord": minor
+---
+
+Add a dynamic-workflow runtime: author workflows as typed functions (`defineWorkflow`) or model-written JS scripts (`runWorkflowScript`) that orchestrate subagents via `agent`/`parallel`/`pipeline`/`phase`/`log`/`args`/`budget`/`workflow` hooks, with structured-output schemas, concurrency/agent caps, resume/journaling, a determinism sandbox, an agent-type registry, isolation backends, and a model-facing `workflowTool`. `orchestrate()` is now built on this runtime (API unchanged).
+```
+
+- [ ] **Step 10: Full verification**
+
+Run: `pnpm test && pnpm typecheck && pnpm lint && pnpm build`, then `pnpm docs:build`
+(run from the repo root; the first command can instead be scoped with `pnpm -C packages/landlord ...`, but the docs build runs from the root)
+Expected: all workflow + existing suites PASS; typecheck clean; Biome lint clean; build emits all three entries; `pnpm docs:build` succeeds.
+
+- [ ] **Step 11: Commit**
+
+```bash
+git add docs/ README.md .changeset/landlord-dynamic-workflows.md
+git commit -m "docs(landlord): dynamic-workflow runtime docs, example, and changeset"
+```
+
+---
+
+## Self-Review (completed during planning)
+
+**Spec coverage:** every spec section maps to a task — hooks/types (T1, T9, T10), concurrency+caps (T2), budget bridge (T3), events (T4), resume/journaling (T5, T13), agentType registry + built-ins (T6), isolation backends (T7), structured-output schema (T8), determinism sandbox + meta (T11), string+typed authoring (T12), run engine + nesting (T13), workflowTool + guide (T14), orchestrate rebuild + packaging (T15), docs + changeset (T16). The harness-coupled non-goals (TUI tree, background notifications, MCP ToolSearch) are intentionally mapped to `onEvent`/`AbortSignal`/explicit tool passing per spec §13.
+
+**Type consistency:** `RunDeps` is defined once (T9) and consumed by `hooks.ts`/`runtime.ts` via type-only import (no runtime cycle). `WorkflowContext`, `AgentOpts`, `Models`, `Meta` come from `types.ts` (T1) throughout. `makeStructuredOutput` returns `{ tool, getValue }` (T8) used verbatim in T9. `RuntimeConfig.models` is `{ default, … }` and `orchestrate()` maps `tenantModel → models.default` (T15), matching the spec's resolved model order `opts.model ?? preset.model ?? config.models.default` (T9).
+
+**Placeholder scan:** no TBD/TODO; every code step contains complete, compilable code; every run step has an exact command and expected outcome.
+
+**Known risk + mitigation:** the `orchestrate()` rebuild (T15) is the only change to shipped behavior — mitigated by reusing `runTenant`/gates/retry verbatim and gating the task on the five existing suites passing unchanged (T15 Step 6).
+
+---
+
+## Execution Handoff
+
+**Plan complete and saved to `docs/superpowers/plans/2026-05-31-landlord-dynamic-workflows.md`. Two execution options:**
+
+**1. 
Subagent-Driven (recommended)** — a fresh subagent per task, two-stage review between tasks, fast iteration. + +**2. Inline Execution** — execute tasks in this session with batch checkpoints for review. + +**Which approach?** + From ecd227e88a82f730e1a2447ffd969c14876c1080 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:22:53 -0600 Subject: [PATCH 03/22] feat(landlord): workflow errors and shared types --- packages/landlord/src/workflow/errors.ts | 23 ++++++ packages/landlord/src/workflow/types.ts | 71 +++++++++++++++++++ packages/landlord/test/workflow/types.test.ts | 17 +++++ 3 files changed, 111 insertions(+) create mode 100644 packages/landlord/src/workflow/errors.ts create mode 100644 packages/landlord/src/workflow/types.ts create mode 100644 packages/landlord/test/workflow/types.test.ts diff --git a/packages/landlord/src/workflow/errors.ts b/packages/landlord/src/workflow/errors.ts new file mode 100644 index 0000000..345831c --- /dev/null +++ b/packages/landlord/src/workflow/errors.ts @@ -0,0 +1,23 @@ +// src/workflow/errors.ts +import { FlintError } from 'flint/errors'; + +export class WorkflowError extends FlintError { + constructor(message: string, code: string, cause?: unknown) { + super(message, { code, ...(cause !== undefined ? 
{ cause } : {}) }); + this.name = 'WorkflowError'; + } +} + +export class AgentCapError extends WorkflowError { + constructor(message: string) { + super(message, 'workflow.agent_cap'); + this.name = 'AgentCapError'; + } +} + +export class MetaError extends WorkflowError { + constructor(message: string) { + super(message, 'workflow.meta'); + this.name = 'MetaError'; + } +} diff --git a/packages/landlord/src/workflow/types.ts b/packages/landlord/src/workflow/types.ts new file mode 100644 index 0000000..b5d9824 --- /dev/null +++ b/packages/landlord/src/workflow/types.ts @@ -0,0 +1,71 @@ +// src/workflow/types.ts +import type { ProviderAdapter, Tool } from 'flint'; + +export type Models = { default: string } & Record; + +export type AgentOpts = { + label?: string; + phase?: string; + schema?: Record; + model?: string; + isolation?: 'worktree'; + agentType?: string; +}; + +export type StageFn = ( + prev: unknown, + originalItem: unknown, + index: number, +) => unknown | Promise; + +export type WorkflowBudgetView = { + total: number | null; + spent: () => number; + remaining: () => number; +}; + +export type WorkflowContext = { + agent: (prompt: string, opts?: AgentOpts) => Promise; + parallel: (thunks: Array<() => Promise>) => Promise>; + pipeline: (items: unknown[], ...stages: StageFn[]) => Promise; + phase: (title: string) => void; + log: (message: string) => void; + args: unknown; + budget: WorkflowBudgetView; + workflow: ( + ref: string | { scriptPath?: string; source?: string }, + args?: unknown, + ) => Promise; +}; + +export type WorkflowEvent = + | { type: 'phase_started'; title: string } + | { type: 'log'; message: string } + | { type: 'agent_started'; label: string; phase?: string; agentType: string; model: string } + | { type: 'agent_complete'; label: string; phase?: string; tokens: number } + | { type: 'agent_error'; label: string; phase?: string; error: string } + | { type: 'workflow_complete'; result: unknown }; + +export type MetaPhase = { title: string; 
detail?: string; model?: string }; + +export type Meta = { + name: string; + description: string; + whenToUse?: string; + model?: string; + phases?: MetaPhase[]; +}; + +export type WorkflowModule = { + meta: Meta; + run: (wf: WorkflowContext) => Promise; +}; + +export type WorkflowRunResult = { + runId: string; + result: unknown; + events: WorkflowEvent[]; +}; + +// Re-exported here so consumers can build tool registries without importing flint directly. +export type { ProviderAdapter, Tool }; diff --git a/packages/landlord/test/workflow/types.test.ts b/packages/landlord/test/workflow/types.test.ts new file mode 100644 index 0000000..4df6ebe --- /dev/null +++ b/packages/landlord/test/workflow/types.test.ts @@ -0,0 +1,17 @@ +// test/workflow/types.test.ts +import { describe, expect, it } from 'vitest'; +import { AgentCapError, MetaError, WorkflowError } from '../../src/workflow/errors.ts'; + +describe('workflow errors', () => { + it('WorkflowError carries a code and name', () => { + const e = new WorkflowError('boom', 'workflow.test'); + expect(e.code).toBe('workflow.test'); + expect(e.name).toBe('WorkflowError'); + expect(e).toBeInstanceOf(Error); + }); + + it('AgentCapError and MetaError have fixed codes', () => { + expect(new AgentCapError('x').code).toBe('workflow.agent_cap'); + expect(new MetaError('x').code).toBe('workflow.meta'); + }); +}); From a19c7055f7412fb2335ce8c68bbb4088d9995466 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:25:59 -0600 Subject: [PATCH 04/22] feat(landlord): concurrency semaphore and agent cap --- packages/landlord/src/workflow/concurrency.ts | 50 +++++++++++++++++++ .../test/workflow/concurrency.test.ts | 40 +++++++++++++++ 2 files changed, 90 insertions(+) create mode 100644 packages/landlord/src/workflow/concurrency.ts create mode 100644 packages/landlord/test/workflow/concurrency.test.ts diff --git a/packages/landlord/src/workflow/concurrency.ts b/packages/landlord/src/workflow/concurrency.ts new file mode 100644 
index 0000000..4184be3 --- /dev/null +++ b/packages/landlord/src/workflow/concurrency.ts @@ -0,0 +1,50 @@ +// src/workflow/concurrency.ts +import { cpus } from 'node:os'; +import { AgentCapError } from './errors.ts'; + +export function defaultConcurrency(): number { + return Math.max(1, Math.min(16, cpus().length - 2)); +} + +/** + * Counting semaphore. The fast path (slot available) runs synchronously up to + * `active++`, so concurrent synchronous `acquire()` calls cannot oversubscribe. + */ +export class Semaphore { + private active = 0; + private readonly waiters: Array<() => void> = []; + + constructor(private readonly limit: number) {} + + async acquire(): Promise<() => void> { + if (this.active >= this.limit) { + await new Promise((resolve) => this.waiters.push(resolve)); + } + this.active++; + let released = false; + return () => { + if (released) return; + released = true; + this.active--; + const next = this.waiters.shift(); + if (next) next(); + }; + } +} + +export class AgentCounter { + private count = 0; + + constructor(private readonly cap: number = 1000) {} + + increment(): void { + this.count += 1; + if (this.count > this.cap) { + throw new AgentCapError(`Workflow exceeded the ${this.cap}-agent cap`); + } + } + + get value(): number { + return this.count; + } +} diff --git a/packages/landlord/test/workflow/concurrency.test.ts b/packages/landlord/test/workflow/concurrency.test.ts new file mode 100644 index 0000000..0843b46 --- /dev/null +++ b/packages/landlord/test/workflow/concurrency.test.ts @@ -0,0 +1,40 @@ +// test/workflow/concurrency.test.ts +import { describe, expect, it } from 'vitest'; +import { AgentCapError } from '../../src/workflow/errors.ts'; +import { AgentCounter, Semaphore, defaultConcurrency } from '../../src/workflow/concurrency.ts'; + +describe('Semaphore', () => { + it('never runs more than `limit` tasks concurrently', async () => { + const sem = new Semaphore(2); + let active = 0; + let peak = 0; + const task = async () => { + 
const release = await sem.acquire(); + active++; + peak = Math.max(peak, active); + await new Promise((r) => setTimeout(r, 5)); + active--; + release(); + }; + await Promise.all(Array.from({ length: 8 }, () => task())); + expect(peak).toBeLessThanOrEqual(2); + }); +}); + +describe('AgentCounter', () => { + it('throws AgentCapError past the cap', () => { + const c = new AgentCounter(3); + c.increment(); + c.increment(); + c.increment(); + expect(() => c.increment()).toThrow(AgentCapError); + }); +}); + +describe('defaultConcurrency', () => { + it('is at least 1 and at most 16', () => { + const n = defaultConcurrency(); + expect(n).toBeGreaterThanOrEqual(1); + expect(n).toBeLessThanOrEqual(16); + }); +}); From 1db2fa0c071feb5b2e7087707a5dc186490e12c5 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:28:54 -0600 Subject: [PATCH 05/22] feat(landlord): workflow budget token-target tracker --- packages/landlord/src/workflow/budget.ts | 37 +++++++++++++++++++ .../landlord/test/workflow/budget.test.ts | 29 +++++++++++++++ 2 files changed, 66 insertions(+) create mode 100644 packages/landlord/src/workflow/budget.ts create mode 100644 packages/landlord/test/workflow/budget.test.ts diff --git a/packages/landlord/src/workflow/budget.ts b/packages/landlord/src/workflow/budget.ts new file mode 100644 index 0000000..85574c2 --- /dev/null +++ b/packages/landlord/src/workflow/budget.ts @@ -0,0 +1,37 @@ +// src/workflow/budget.ts +import type { WorkflowBudgetView } from './types.ts'; + +/** + * Tracks the run's output-token spend against an optional target (the ultracode + * "+500k"-style ceiling). `total === null` means no target → unbounded remaining. + */ +export class WorkflowBudget { + private outputTokens = 0; + readonly total: number | null; + + constructor(total: number | null) { + this.total = total; + } + + record(usage: { input?: number; output?: number; cached?: number }): void { + this.outputTokens += usage.output ?? 
0; + } + + spent(): number { + return this.outputTokens; + } + + remaining(): number { + return this.total === null + ? Number.POSITIVE_INFINITY + : Math.max(0, this.total - this.outputTokens); + } +} + +export function budgetView(wb: WorkflowBudget): WorkflowBudgetView { + return { + total: wb.total, + spent: () => wb.spent(), + remaining: () => wb.remaining(), + }; +} diff --git a/packages/landlord/test/workflow/budget.test.ts b/packages/landlord/test/workflow/budget.test.ts new file mode 100644 index 0000000..7a4652d --- /dev/null +++ b/packages/landlord/test/workflow/budget.test.ts @@ -0,0 +1,29 @@ +// test/workflow/budget.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget, budgetView } from '../../src/workflow/budget.ts'; + +describe('WorkflowBudget', () => { + it('tracks spent output tokens and computes remaining against a target', () => { + const wb = new WorkflowBudget(100); + wb.record({ input: 10, output: 30 }); + wb.record({ input: 5, output: 20 }); + expect(wb.spent()).toBe(50); + expect(wb.remaining()).toBe(50); + }); + + it('remaining is Infinity when total is null', () => { + const wb = new WorkflowBudget(null); + wb.record({ output: 1000 }); + expect(wb.spent()).toBe(1000); + expect(wb.remaining()).toBe(Number.POSITIVE_INFINITY); + }); + + it('budgetView exposes total/spent/remaining bound to the instance', () => { + const wb = new WorkflowBudget(10); + const view = budgetView(wb); + wb.record({ output: 4 }); + expect(view.total).toBe(10); + expect(view.spent()).toBe(4); + expect(view.remaining()).toBe(6); + }); +}); From 3cf03aff7a3fa39b5e9dcb25ea5a0bce4d7217fa Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:30:47 -0600 Subject: [PATCH 06/22] feat(landlord): workflow event emitter --- packages/landlord/src/workflow/events.ts | 18 ++++++++++++++++ .../landlord/test/workflow/events.test.ts | 21 +++++++++++++++++++ 2 files changed, 39 insertions(+) create mode 100644 
packages/landlord/src/workflow/events.ts create mode 100644 packages/landlord/test/workflow/events.test.ts diff --git a/packages/landlord/src/workflow/events.ts b/packages/landlord/src/workflow/events.ts new file mode 100644 index 0000000..ba1bf08 --- /dev/null +++ b/packages/landlord/src/workflow/events.ts @@ -0,0 +1,18 @@ +import type { WorkflowEvent } from './types.ts'; + +export type EventSink = (event: WorkflowEvent) => void; + +export class EventEmitter { + private readonly events: WorkflowEvent[] = []; + + constructor(private readonly sink?: EventSink) {} + + emit(event: WorkflowEvent): void { + this.events.push(event); + this.sink?.(event); + } + + all(): WorkflowEvent[] { + return this.events; + } +} diff --git a/packages/landlord/test/workflow/events.test.ts b/packages/landlord/test/workflow/events.test.ts new file mode 100644 index 0000000..00e57cb --- /dev/null +++ b/packages/landlord/test/workflow/events.test.ts @@ -0,0 +1,21 @@ +// test/workflow/events.test.ts +import { describe, expect, it } from 'vitest'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import type { WorkflowEvent } from '../../src/workflow/types.ts'; + +describe('EventEmitter', () => { + it('records events and forwards them to the sink', () => { + const seen: WorkflowEvent[] = []; + const em = new EventEmitter((e) => seen.push(e)); + em.emit({ type: 'log', message: 'hi' }); + em.emit({ type: 'phase_started', title: 'Find' }); + expect(seen).toHaveLength(2); + expect(em.all().map((e) => e.type)).toEqual(['log', 'phase_started']); + }); + + it('works with no sink', () => { + const em = new EventEmitter(); + em.emit({ type: 'log', message: 'x' }); + expect(em.all()).toHaveLength(1); + }); +}); From 47f800a61676c74d4715f471613a25390b64af4c Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:32:57 -0600 Subject: [PATCH 07/22] feat(landlord): journal store and call hashing for resume --- packages/landlord/src/workflow/journal.ts | 67 +++++++++++++++++++ 
.../landlord/test/workflow/journal.test.ts | 37 ++++++++++ 2 files changed, 104 insertions(+) create mode 100644 packages/landlord/src/workflow/journal.ts create mode 100644 packages/landlord/test/workflow/journal.test.ts diff --git a/packages/landlord/src/workflow/journal.ts b/packages/landlord/src/workflow/journal.ts new file mode 100644 index 0000000..1b0e9fc --- /dev/null +++ b/packages/landlord/src/workflow/journal.ts @@ -0,0 +1,67 @@ +// src/workflow/journal.ts +import { appendFile, mkdir, readFile } from 'node:fs/promises'; +import { join } from 'node:path'; + +export type JournalEntry = { index: number; hash: string; result: unknown }; + +export interface JournalStore { + append(runId: string, entry: JournalEntry): Promise<void>; + load(runId: string): Promise<JournalEntry[]>; +} + +function stableStringify(value: unknown): string { + if (value === null || typeof value !== 'object') return JSON.stringify(value) ?? 'null'; + if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`; + const keys = Object.keys(value as Record<string, unknown>).sort(); + const body = keys + .map((k) => `${JSON.stringify(k)}:${stableStringify((value as Record<string, unknown>)[k])}`) + .join(','); + return `{${body}}`; +} + +/** FNV-1a (32-bit) hex of the stable-stringified call signature. */ +export function hashCall(prompt: string, opts: unknown): string { + const input = stableStringify({ prompt, opts: opts ?? {} }); + let h = 0x811c9dc5; + for (let i = 0; i < input.length; i++) { + h ^= input.charCodeAt(i); + h = Math.imul(h, 0x01000193); + } + return (h >>> 0).toString(16).padStart(8, '0'); +} + +export function memoryJournalStore(): JournalStore { + const runs = new Map<string, JournalEntry[]>(); + return { + async append(runId, entry) { + const list = runs.get(runId) ?? []; + list.push(entry); + runs.set(runId, list); + }, + async load(runId) { + return [...(runs.get(runId) ?? 
[])]; + }, + }; +} + +export function fileJournalStore(dir: string): JournalStore { + const path = (runId: string) => join(dir, `journal-${runId}.jsonl`); + return { + async append(runId, entry) { + await mkdir(dir, { recursive: true }); + await appendFile(path(runId), `${JSON.stringify(entry)}\n`, 'utf-8'); + }, + async load(runId) { + let text: string; + try { + text = await readFile(path(runId), 'utf-8'); + } catch { + return []; + } + return text + .split('\n') + .filter((l) => l.trim().length > 0) + .map((l) => JSON.parse(l) as JournalEntry); + }, + }; +} diff --git a/packages/landlord/test/workflow/journal.test.ts b/packages/landlord/test/workflow/journal.test.ts new file mode 100644 index 0000000..a6b3228 --- /dev/null +++ b/packages/landlord/test/workflow/journal.test.ts @@ -0,0 +1,37 @@ +// test/workflow/journal.test.ts +import { mkdtemp } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { describe, expect, it } from 'vitest'; +import { fileJournalStore, hashCall, memoryJournalStore } from '../../src/workflow/journal.ts'; + +describe('hashCall', () => { + it('is stable regardless of opts key order', () => { + const a = hashCall('p', { label: 'x', phase: 'y' }); + const b = hashCall('p', { phase: 'y', label: 'x' }); + expect(a).toBe(b); + }); + it('changes when the prompt changes', () => { + expect(hashCall('a', {})).not.toBe(hashCall('b', {})); + }); +}); + +describe('memoryJournalStore', () => { + it('appends and loads entries in order', async () => { + const s = memoryJournalStore(); + await s.append('run1', { index: 0, hash: 'h0', result: 'r0' }); + await s.append('run1', { index: 1, hash: 'h1', result: 'r1' }); + const entries = await s.load('run1'); + expect(entries.map((e) => e.result)).toEqual(['r0', 'r1']); + }); +}); + +describe('fileJournalStore', () => { + it('round-trips entries through JSONL', async () => { + const dir = await mkdtemp(join(tmpdir(), 'jrnl-')); + const s = 
fileJournalStore(dir); + await s.append('runA', { index: 0, hash: 'h', result: { ok: true } }); + const entries = await s.load('runA'); + expect(entries).toEqual([{ index: 0, hash: 'h', result: { ok: true } }]); + }); +}); From 9473964887562695a167dc87adfc950aca51fbc5 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:35:38 -0600 Subject: [PATCH 08/22] feat(landlord): agent-type and workflow registries --- packages/landlord/src/workflow/registry.ts | 61 +++++++++++++++++++ .../landlord/test/workflow/registry.test.ts | 28 +++++++++ 2 files changed, 89 insertions(+) create mode 100644 packages/landlord/src/workflow/registry.ts create mode 100644 packages/landlord/test/workflow/registry.test.ts diff --git a/packages/landlord/src/workflow/registry.ts b/packages/landlord/src/workflow/registry.ts new file mode 100644 index 0000000..d22bfb8 --- /dev/null +++ b/packages/landlord/src/workflow/registry.ts @@ -0,0 +1,61 @@ +// src/workflow/registry.ts +import type { Tool } from 'flint'; +import { bashTool, fileReadTool, standardTools, webFetchTool } from '../tools/index.ts'; +import { WorkflowError } from './errors.ts'; + +export type AgentType = { + systemPrompt: string; + tools?: (workDir: string) => Tool[]; + model?: string; +}; + +export type AgentTypeRegistry = { + resolve(name: string): AgentType; + has(name: string): boolean; +}; + +export const BUILT_IN_AGENT_TYPES: Record<string, AgentType> = { + default: { + systemPrompt: + 'You are a focused worker agent. Use your tools to accomplish the task. ' + + 'When a structured result is requested, return it by calling the structured_output tool.', + tools: (workDir) => standardTools(workDir), + }, + Explore: { + systemPrompt: + 'You are a read-only exploration agent. Search broadly, read excerpts rather than whole ' + + 'files, and return conclusions — never modify anything. 
You have read and web tools only.', + tools: (workDir) => [fileReadTool(workDir), webFetchTool(workDir)], + }, + 'code-reviewer': { + systemPrompt: + 'You are a code reviewer. Read the relevant code and report concrete issues (bugs, security, ' + + 'quality) with file and line references. Return findings via structured_output when asked.', + tools: (workDir) => [fileReadTool(workDir), bashTool(workDir)], + }, +}; + +export function createAgentRegistry(custom?: Record<string, AgentType>): AgentTypeRegistry { + const merged: Record<string, AgentType> = { ...BUILT_IN_AGENT_TYPES, ...(custom ?? {}) }; + return { + has: (name) => name in merged, + resolve: (name) => { + const t = merged[name]; + if (t === undefined) { + throw new WorkflowError( + `Unknown agentType '${name}'. Known: ${Object.keys(merged).join(', ')}`, + 'workflow.unknown_agent_type', + ); + } + return t; + }, + }; +} + +export type WorkflowRegistry = { + resolve(name: string): string | undefined; +}; + +export function createWorkflowRegistry(scripts: Record<string, string>): WorkflowRegistry { + return { resolve: (name) => scripts[name] }; +} diff --git a/packages/landlord/test/workflow/registry.test.ts b/packages/landlord/test/workflow/registry.test.ts new file mode 100644 index 0000000..a11a4dc --- /dev/null +++ b/packages/landlord/test/workflow/registry.test.ts @@ -0,0 +1,28 @@ +// test/workflow/registry.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowError } from '../../src/workflow/errors.ts'; +import { createAgentRegistry, createWorkflowRegistry } from '../../src/workflow/registry.ts'; + +describe('createAgentRegistry', () => { + it('resolves built-in types', () => { + const reg = createAgentRegistry(); + expect(reg.has('default')).toBe(true); + expect(reg.has('Explore')).toBe(true); + expect(reg.has('code-reviewer')).toBe(true); + expect(reg.resolve('default').tools?.('/tmp/x').length).toBeGreaterThan(0); + }); + + it('merges custom types over built-ins and throws on unknown', () => { + const reg = createAgentRegistry({ 
custom: { systemPrompt: 'You are custom.' } }); + expect(reg.resolve('custom').systemPrompt).toBe('You are custom.'); + expect(() => reg.resolve('missing')).toThrow(WorkflowError); + }); +}); + +describe('createWorkflowRegistry', () => { + it('resolves named sources', () => { + const reg = createWorkflowRegistry({ greet: 'return "hi"' }); + expect(reg.resolve('greet')).toBe('return "hi"'); + expect(reg.resolve('nope')).toBeUndefined(); + }); +}); From eb3f61a58f261e4cf95b72482c142c57eebc3963 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:38:26 -0600 Subject: [PATCH 09/22] feat(landlord): workdir and git-worktree isolation backends --- packages/landlord/src/workflow/isolation.ts | 59 +++++++++++++++++++ .../landlord/test/workflow/isolation.test.ts | 30 ++++++++++ 2 files changed, 89 insertions(+) create mode 100644 packages/landlord/src/workflow/isolation.ts create mode 100644 packages/landlord/test/workflow/isolation.test.ts diff --git a/packages/landlord/src/workflow/isolation.ts b/packages/landlord/src/workflow/isolation.ts new file mode 100644 index 0000000..0aca64f --- /dev/null +++ b/packages/landlord/src/workflow/isolation.ts @@ -0,0 +1,59 @@ +import { exec } from 'node:child_process'; +import { mkdir } from 'node:fs/promises'; +import { join } from 'node:path'; +import { promisify } from 'node:util'; + +const execAsync = promisify(exec); + +export type IsolationLease = { workDir: string; release: () => Promise<void> }; + +export interface IsolationBackend { + acquire(label: string): Promise<IsolationLease>; +} + +function sanitize(label: string): string { + return label.replace(/[^a-zA-Z0-9_-]/g, '_').slice(0, 40) || 'agent'; +} + +export function workdirIsolation(baseDir: string): IsolationBackend { + let counter = 0; + return { + async acquire(label) { + const workDir = join(baseDir, `${sanitize(label)}-${counter++}`); + await mkdir(workDir, { recursive: true }); + return { workDir, release: async () => {} }; + }, + }; +} + +export function 
gitWorktreeIsolation(repoDir: string, baseDir: string): IsolationBackend { + const fallback = workdirIsolation(baseDir); + let counter = 0; + return { + async acquire(label) { + try { + await execAsync('git rev-parse --is-inside-work-tree', { cwd: repoDir }); + } catch { + return fallback.acquire(label); + } + const workDir = join(baseDir, `wt-${sanitize(label)}-${counter++}`); + try { + await execAsync(`git worktree add --detach ${JSON.stringify(workDir)}`, { cwd: repoDir }); + } catch { + return fallback.acquire(label); + } + return { + workDir, + release: async () => { + try { + await execAsync(`git worktree remove --force ${JSON.stringify(workDir)}`, { + cwd: repoDir, + }); + } catch { + /* leave the worktree for inspection if removal fails */ + } + }, + }; + }, + }; +} diff --git a/packages/landlord/test/workflow/isolation.test.ts b/packages/landlord/test/workflow/isolation.test.ts new file mode 100644 index 0000000..2d3e574 --- /dev/null +++ b/packages/landlord/test/workflow/isolation.test.ts @@ -0,0 +1,30 @@ +// test/workflow/isolation.test.ts +import { mkdtemp, stat } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { describe, expect, it } from 'vitest'; +import { gitWorktreeIsolation, workdirIsolation } from '../../src/workflow/isolation.ts'; + +describe('workdirIsolation', () => { + it('creates a distinct existing directory per acquire', async () => { + const base = await mkdtemp(join(tmpdir(), 'iso-')); + const backend = workdirIsolation(base); + const a = await backend.acquire('alpha'); + const b = await backend.acquire('alpha'); + expect(a.workDir).not.toBe(b.workDir); + expect((await stat(a.workDir)).isDirectory()).toBe(true); + await a.release(); + await b.release(); + }); +}); + +describe('gitWorktreeIsolation', () => { + it('falls back to a workdir lease outside a git repo', async () => { + const base = await mkdtemp(join(tmpdir(), 'iso2-')); + const notRepo = await mkdtemp(join(tmpdir(), 
'norepo-')); + const backend = gitWorktreeIsolation(notRepo, base); + const lease = await backend.acquire('w'); + expect((await stat(lease.workDir)).isDirectory()).toBe(true); + await lease.release(); + }); +}); From ff743d64107fab24a9055e30c9cacbbd9342162e Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:41:47 -0600 Subject: [PATCH 10/22] feat(landlord): structured-output tool with ajv validation Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/schema.ts | 68 +++++++++++++++++++ .../landlord/test/workflow/schema.test.ts | 35 ++++++++++ 2 files changed, 103 insertions(+) create mode 100644 packages/landlord/src/workflow/schema.ts create mode 100644 packages/landlord/test/workflow/schema.test.ts diff --git a/packages/landlord/src/workflow/schema.ts b/packages/landlord/src/workflow/schema.ts new file mode 100644 index 0000000..2054a21 --- /dev/null +++ b/packages/landlord/src/workflow/schema.ts @@ -0,0 +1,68 @@ +import Ajv from 'ajv'; +import { tool } from 'flint'; +import type { StandardSchemaV1, Tool } from 'flint'; + +const ajv = new Ajv({ allErrors: true }); + +function anyObjectSchema(): StandardSchemaV1> { + return { + '~standard': { + version: 1, + vendor: 'landlord', + validate: (v) => { + if (typeof v !== 'object' || v === null || Array.isArray(v)) { + return { issues: [{ message: 'Expected an object' }] }; + } + return { value: v as Record }; + }, + }, + }; +} + +export type StructuredOutput = { + tool: Tool; + getValue: () => unknown; +}; + +/** + * Build a forced `structured_output` tool for an `agent()` call. Object schemas + * are presented as-is; non-object schemas are wrapped under a `result` key and + * unwrapped on capture. The handler validates with ajv and returns a corrective + * message on mismatch so the agent loop retries. + */ +export function makeStructuredOutput(schema: Record): StructuredOutput { + const wrapped = schema['type'] !== 'object'; + const jsonSchema: Record = wrapped + ? 
{ type: 'object', properties: { result: schema }, required: ['result'] } + : schema; + + let validate: ReturnType; + try { + validate = ajv.compile(jsonSchema); + } catch { + validate = ajv.compile({ type: 'object' }); + } + + let captured: unknown; + let done = false; + + const t = tool({ + name: 'structured_output', + description: + 'Return your final result as JSON matching the required schema. Call this exactly once.', + input: anyObjectSchema(), + jsonSchema, + handler: (input: Record) => { + if (!validate(input)) { + return `Output does not match the required schema: ${ajv.errorsText(validate.errors)}. Call structured_output again with corrected fields.`; + } + if (!done) { + captured = wrapped ? (input as { result: unknown }).result : input; + done = true; + } + return 'Accepted.'; + }, + }) as unknown as Tool; + + return { tool: t, getValue: () => captured }; +} diff --git a/packages/landlord/test/workflow/schema.test.ts b/packages/landlord/test/workflow/schema.test.ts new file mode 100644 index 0000000..0df9349 --- /dev/null +++ b/packages/landlord/test/workflow/schema.test.ts @@ -0,0 +1,35 @@ +// test/workflow/schema.test.ts +import { execute } from 'flint'; +import { describe, expect, it } from 'vitest'; +import { makeStructuredOutput } from '../../src/workflow/schema.ts'; + +describe('makeStructuredOutput', () => { + it('captures a valid object and reports success', async () => { + const so = makeStructuredOutput({ + type: 'object', + properties: { name: { type: 'string' } }, + required: ['name'], + }); + const res = await execute(so.tool, { name: 'ada' }); + expect(res.ok).toBe(true); + expect(so.getValue()).toEqual({ name: 'ada' }); + }); + + it('rejects an invalid object and leaves value undefined', async () => { + const so = makeStructuredOutput({ + type: 'object', + properties: { n: { type: 'number' } }, + required: ['n'], + }); + const res = await execute(so.tool, { n: 'not-a-number' }); + expect(res.ok).toBe(true); // handler returns an error 
string, not a thrown error + expect(String(res.ok ? res.value : '')).toMatch(/does not match/i); + expect(so.getValue()).toBeUndefined(); + }); + + it('wraps non-object schemas under a result key and unwraps the captured value', async () => { + const so = makeStructuredOutput({ type: 'array', items: { type: 'string' } }); + await execute(so.tool, { result: ['a', 'b'] }); + expect(so.getValue()).toEqual(['a', 'b']); + }); +}); From 70eebbf3c40341f2ab58dc35f4ecbe78fe13c675 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:45:58 -0600 Subject: [PATCH 11/22] feat(landlord): agent() hook with schema, isolation, journaling Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/agentcall.ts | 158 ++++++++++++++++++ .../landlord/test/workflow/agentcall.test.ts | 102 +++++++++++ 2 files changed, 260 insertions(+) create mode 100644 packages/landlord/src/workflow/agentcall.ts create mode 100644 packages/landlord/test/workflow/agentcall.test.ts diff --git a/packages/landlord/src/workflow/agentcall.ts b/packages/landlord/src/workflow/agentcall.ts new file mode 100644 index 0000000..f3c0a23 --- /dev/null +++ b/packages/landlord/src/workflow/agentcall.ts @@ -0,0 +1,158 @@ +// src/workflow/agentcall.ts +import { agent } from 'flint'; +import type { ProviderAdapter } from 'flint'; +import type { Budget } from 'flint/budget'; +import { standardTools } from '../tools/index.ts'; +import type { AgentCounter, Semaphore } from './concurrency.ts'; +import type { WorkflowBudget } from './budget.ts'; +import { WorkflowError } from './errors.ts'; +import type { EventEmitter } from './events.ts'; +import { hashCall } from './journal.ts'; +import type { JournalEntry, JournalStore } from './journal.ts'; +import type { IsolationBackend } from './isolation.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { makeStructuredOutput } from './schema.ts'; +import type { AgentOpts, Models } from './types.ts'; + +export type RunDeps = { 
+ adapter: ProviderAdapter; + models: Models; + flintBudget: Budget; + wfBudget: WorkflowBudget; + semaphore: Semaphore; + counter: AgentCounter; + registry: AgentTypeRegistry; + workflows: WorkflowRegistry | undefined; + isolation: IsolationBackend; + worktreeIsolation: IsolationBackend | undefined; + emitter: EventEmitter; + journal: JournalStore; + runId: string; + resumeEntries: JournalEntry[]; + signal: AbortSignal | undefined; + args: unknown; + depth: number; + nextIndex: () => number; + currentPhase: { value: string | undefined }; +}; + +export function deriveLabel(prompt: string): string { + const firstLine = prompt.split('\n', 1)[0] ?? prompt; + return firstLine.length > 48 ? `${firstLine.slice(0, 48)}…` : firstLine; +} + +export async function runAgentCall( + prompt: string, + opts: AgentOpts | undefined, + deps: RunDeps, +): Promise<unknown> { + const index = deps.nextIndex(); + const hash = hashCall(prompt, opts ?? {}); + + // Resume: replay a cached result when this call's signature is unchanged. + const cached = deps.resumeEntries.find((e) => e.index === index); + if (cached !== undefined && cached.hash === hash) { + return cached.result; + } + + // Token-target ceiling. + if (deps.wfBudget.total !== null && deps.wfBudget.remaining() <= 0) { + throw new WorkflowError( + `Workflow token target (${deps.wfBudget.total}) reached`, + 'workflow.budget', + ); + } + + deps.counter.increment(); + const release = await deps.semaphore.acquire(); + + const label = opts?.label ?? deriveLabel(prompt); + const phase = opts?.phase; + const agentType = opts?.agentType ?? 'default'; + const preset = deps.registry.resolve(agentType); + const model = opts?.model ?? preset.model ?? deps.models.default; + const backend = + opts?.isolation === 'worktree' && deps.worktreeIsolation !== undefined + ? deps.worktreeIsolation + : deps.isolation; + const lease = await backend.acquire(label); + + deps.emitter.emit({ + type: 'agent_started', + label, + ...(phase !== undefined ? 
{ phase } : {}), + agentType, + model, + }); + + try { + const baseTools = preset.tools ? preset.tools(lease.workDir) : standardTools(lease.workDir); + let result: unknown; + let tokens = 0; + + if (opts?.schema !== undefined) { + const so = makeStructuredOutput(opts.schema); + const systemPrompt = + `${preset.systemPrompt}\n\nYou MUST call the structured_output tool exactly once with your ` + + 'final result as JSON matching the required schema. Do not finish until you have called it.'; + const out = await agent({ + adapter: deps.adapter, + model, + messages: [ + { role: 'system', content: systemPrompt }, + { role: 'user', content: prompt }, + ], + tools: [so.tool, ...baseTools], + budget: deps.flintBudget, + ...(deps.signal !== undefined ? { signal: deps.signal } : {}), + }); + if (!out.ok) throw out.error; + deps.wfBudget.record(out.value.usage); + tokens = out.value.usage.input + out.value.usage.output; + const value = so.getValue(); + if (value === undefined) { + throw new WorkflowError( + `Agent '${label}' finished without producing structured output`, + 'workflow.no_output', + ); + } + result = value; + } else { + const out = await agent({ + adapter: deps.adapter, + model, + messages: [ + { role: 'system', content: preset.systemPrompt }, + { role: 'user', content: prompt }, + ], + tools: baseTools, + budget: deps.flintBudget, + ...(deps.signal !== undefined ? { signal: deps.signal } : {}), + }); + if (!out.ok) throw out.error; + deps.wfBudget.record(out.value.usage); + tokens = out.value.usage.input + out.value.usage.output; + result = out.value.message.content; + } + + deps.emitter.emit({ + type: 'agent_complete', + label, + ...(phase !== undefined ? { phase } : {}), + tokens, + }); + await deps.journal.append(deps.runId, { index, hash, result }); + return result; + } catch (e) { + deps.emitter.emit({ + type: 'agent_error', + label, + ...(phase !== undefined ? { phase } : {}), + error: e instanceof Error ? 
e.message : String(e), + }); + throw e; + } finally { + await lease.release(); + release(); + } +} diff --git a/packages/landlord/test/workflow/agentcall.test.ts b/packages/landlord/test/workflow/agentcall.test.ts new file mode 100644 index 0000000..0f40d3e --- /dev/null +++ b/packages/landlord/test/workflow/agentcall.test.ts @@ -0,0 +1,102 @@ +// test/workflow/agentcall.test.ts +import type { NormalizedResponse } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import { mkdtemp } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { mockAdapter, scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget } from '../../src/workflow/budget.ts'; +import { AgentCounter, Semaphore } from '../../src/workflow/concurrency.ts'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import { memoryJournalStore } from '../../src/workflow/journal.ts'; +import { createAgentRegistry } from '../../src/workflow/registry.ts'; +import { workdirIsolation } from '../../src/workflow/isolation.ts'; +import { runAgentCall } from '../../src/workflow/agentcall.ts'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; + +function textResponse(content: string): NormalizedResponse { + return { + message: { role: 'assistant', content }, + usage: { input: 10, output: 5 }, + stopReason: 'end', + }; +} +function toolCallResponse(name: string, args: unknown): NormalizedResponse { + return { + message: { role: 'assistant', content: '', toolCalls: [{ id: 'tc1', name, arguments: args }] }, + usage: { input: 20, output: 10 }, + stopReason: 'tool_call', + }; +} + +async function makeDeps(adapter: RunDeps['adapter']): Promise<RunDeps> { + const base = await mkdtemp(join(tmpdir(), 'ac-')); + let index = 0; + return { + adapter, + models: { default: 'test' }, + flintBudget: makeBudget({ maxSteps: 50 }), + wfBudget: new WorkflowBudget(null), + semaphore: new Semaphore(4), 
+ counter: new AgentCounter(1000), + registry: createAgentRegistry(), + workflows: undefined, + isolation: workdirIsolation(base), + worktreeIsolation: undefined, + emitter: new EventEmitter(), + journal: memoryJournalStore(), + runId: 'run-test', + resumeEntries: [], + signal: undefined, + args: undefined, + depth: 0, + nextIndex: () => index++, + currentPhase: { value: undefined }, + }; +} + +describe('runAgentCall', () => { + it('returns the final text for a no-schema call and records the journal', async () => { + const deps = await makeDeps(scriptedAdapter([textResponse('hello world')])); + const result = await runAgentCall('say hi', undefined, deps); + expect(result).toBe('hello world'); + expect(await deps.journal.load('run-test')).toHaveLength(1); + expect(deps.emitter.all().map((e) => e.type)).toEqual(['agent_started', 'agent_complete']); + }); + + it('returns the validated object for a schema call', async () => { + const deps = await makeDeps( + scriptedAdapter([ + toolCallResponse('structured_output', { name: 'ada' }), + textResponse('done'), + ]), + ); + const result = await runAgentCall( + 'produce', + { + schema: { + type: 'object', + properties: { name: { type: 'string' } }, + required: ['name'], + }, + }, + deps, + ); + expect(result).toEqual({ name: 'ada' }); + }); + + it('replays a cached result on resume without calling the adapter', async () => { + const throwingAdapter = mockAdapter({ + onCall: () => { + throw new Error('must not call'); + }, + }); + const deps = await makeDeps(throwingAdapter); + // Pre-seed resume entry: index 0 with the matching hash for ('say hi', {}). + const { hashCall } = await import('../../src/workflow/journal.ts'); + deps.resumeEntries = [{ index: 0, hash: hashCall('say hi', {}), result: 'cached!' 
}]; + const result = await runAgentCall('say hi', undefined, deps); + expect(result).toBe('cached!'); + }); +}); From a2770f04deccefea2777cdf3e726bd0b38976c98 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:48:59 -0600 Subject: [PATCH 12/22] feat(landlord): workflow context (parallel/pipeline/phase/log) --- packages/landlord/src/workflow/hooks.ts | 61 +++++++++++++++++++ packages/landlord/test/workflow/hooks.test.ts | 52 ++++++++++++++++ 2 files changed, 113 insertions(+) create mode 100644 packages/landlord/src/workflow/hooks.ts create mode 100644 packages/landlord/test/workflow/hooks.test.ts diff --git a/packages/landlord/src/workflow/hooks.ts b/packages/landlord/src/workflow/hooks.ts new file mode 100644 index 0000000..0779726 --- /dev/null +++ b/packages/landlord/src/workflow/hooks.ts @@ -0,0 +1,61 @@ +import { runAgentCall } from './agentcall.ts'; +import type { RunDeps } from './agentcall.ts'; +import { budgetView } from './budget.ts'; +import type { AgentOpts, StageFn, WorkflowContext } from './types.ts'; + +export function buildContext( + deps: RunDeps, + workflowFn: WorkflowContext['workflow'], +): WorkflowContext { + const agent = (prompt: string, opts?: AgentOpts): Promise => { + const phase = opts?.phase ?? deps.currentPhase.value; + const merged: AgentOpts = { ...(opts ?? {}), ...(phase !== undefined ? 
{ phase } : {}) }; + return runAgentCall(prompt, merged, deps); + }; + + const parallel = async (thunks: Array<() => Promise>): Promise> => + Promise.all( + thunks.map(async (thunk) => { + try { + return await thunk(); + } catch { + return null; + } + }), + ); + + const pipeline = async (items: unknown[], ...stages: StageFn[]): Promise => + Promise.all( + items.map(async (item, index) => { + let acc: unknown = item; + for (const stage of stages) { + try { + acc = await stage(acc, item, index); + } catch { + return null; + } + } + return acc; + }), + ); + + const phase = (title: string): void => { + deps.currentPhase.value = title; + deps.emitter.emit({ type: 'phase_started', title }); + }; + + const log = (message: string): void => { + deps.emitter.emit({ type: 'log', message }); + }; + + return { + agent, + parallel, + pipeline, + phase, + log, + args: deps.args, + budget: budgetView(deps.wfBudget), + workflow: workflowFn, + }; +} diff --git a/packages/landlord/test/workflow/hooks.test.ts b/packages/landlord/test/workflow/hooks.test.ts new file mode 100644 index 0000000..230a562 --- /dev/null +++ b/packages/landlord/test/workflow/hooks.test.ts @@ -0,0 +1,52 @@ +// test/workflow/hooks.test.ts +import { describe, expect, it } from 'vitest'; +import { WorkflowBudget } from '../../src/workflow/budget.ts'; +import { EventEmitter } from '../../src/workflow/events.ts'; +import { buildContext } from '../../src/workflow/hooks.ts'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; + +function fakeDeps(): RunDeps { + return { + emitter: new EventEmitter(), + wfBudget: new WorkflowBudget(100), + args: { topic: 'x' }, + currentPhase: { value: undefined }, + // unused-by-these-tests fields: + } as unknown as RunDeps; +} + +describe('buildContext combinators', () => { + it('parallel maps a throwing thunk to null', async () => { + const ctx = buildContext(fakeDeps(), async () => null); + const out = await ctx.parallel([ + async () => 1, + async () => { + throw new 
Error('x'); + }, + ]); + expect(out).toEqual([1, null]); + }); + + it('pipeline runs stages per item and drops a throwing item to null', async () => { + const ctx = buildContext(fakeDeps(), async () => null); + const out = await ctx.pipeline( + [1, 2], + (prev) => (prev as number) + 1, + (prev, original, i) => { + if (original === 2) throw new Error('boom'); + return `${prev}@${i}`; + }, + ); + expect(out).toEqual(['2@0', null]); + }); + + it('phase and log emit events; budget and args are exposed', async () => { + const deps = fakeDeps(); + const ctx = buildContext(deps, async () => null); + ctx.phase('Find'); + ctx.log('looking'); + expect(deps.emitter.all().map((e) => e.type)).toEqual(['phase_started', 'log']); + expect(ctx.budget.total).toBe(100); + expect(ctx.args).toEqual({ topic: 'x' }); + }); +}); From 3af142bbd3e433abac6e57e9274a2ca7ad15cddb Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:52:04 -0600 Subject: [PATCH 13/22] feat(landlord): meta literal parser and determinism sandbox Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/meta.ts | 119 ++++++++++++++++++ packages/landlord/src/workflow/sandbox.ts | 40 ++++++ packages/landlord/test/workflow/meta.test.ts | 22 ++++ .../landlord/test/workflow/sandbox.test.ts | 16 +++ 4 files changed, 197 insertions(+) create mode 100644 packages/landlord/src/workflow/meta.ts create mode 100644 packages/landlord/src/workflow/sandbox.ts create mode 100644 packages/landlord/test/workflow/meta.test.ts create mode 100644 packages/landlord/test/workflow/sandbox.test.ts diff --git a/packages/landlord/src/workflow/meta.ts b/packages/landlord/src/workflow/meta.ts new file mode 100644 index 0000000..cce4967 --- /dev/null +++ b/packages/landlord/src/workflow/meta.ts @@ -0,0 +1,119 @@ +import { MetaError } from './errors.ts'; +import type { Meta } from './types.ts'; + +function skipWs(s: string, i: number): number { + while (i < s.length) { + const c = s[i]; + if (c === ' ' || c === '\n' || c 
=== '\t' || c === '\r') i++; + else break; + } + return i; +} + +function parseString(s: string, i: number): { value: string; end: number } { + const quote = s[i]; + i++; + let out = ''; + while (i < s.length && s[i] !== quote) { + if (s[i] === '\\') { + const n = s[i + 1]; + out += + n === 'n' + ? '\n' + : n === 't' + ? '\t' + : n === 'r' + ? '\r' + : n === '\\' + ? '\\' + : n === quote + ? quote + : (n ?? ''); + i += 2; + } else { + out += s[i]; + i++; + } + } + if (s[i] !== quote) throw new MetaError('Unterminated string in meta literal'); + return { value: out, end: i + 1 }; +} + +function parseNumber(s: string, i: number): { value: number; end: number } { + const m = /^-?\d+(\.\d+)?([eE][+-]?\d+)?/.exec(s.slice(i)); + if (!m) throw new MetaError('Invalid number in meta literal'); + return { value: Number(m[0]), end: i + m[0].length }; +} + +function parseKey(s: string, i: number): { value: string; end: number } { + const c = s[i]; + if (c === '"' || c === "'") return parseString(s, i); + const m = /^[A-Za-z_$][\w$]*/.exec(s.slice(i)); + if (!m) throw new MetaError(`Invalid object key at index ${i}`); + return { value: m[0], end: i + m[0].length }; +} + +export function parseLiteral(s: string, start = 0): { value: unknown; end: number } { + const i = skipWs(s, start); + const ch = s[i]; + if (ch === '{') { + let j = skipWs(s, i + 1); + const obj: Record = {}; + if (s[j] === '}') return { value: obj, end: j + 1 }; + while (j < s.length) { + j = skipWs(s, j); + const key = parseKey(s, j); + j = skipWs(s, key.end); + if (s[j] !== ':') throw new MetaError(`Expected ':' at index ${j}`); + const val = parseLiteral(s, j + 1); + obj[key.value] = val.value; + j = skipWs(s, val.end); + if (s[j] === ',') { + j = skipWs(s, j + 1); + if (s[j] === '}') return { value: obj, end: j + 1 }; + continue; + } + if (s[j] === '}') return { value: obj, end: j + 1 }; + throw new MetaError(`Expected ',' or '}' at index ${j}`); + } + throw new MetaError('Unterminated object in meta 
literal'); + } + if (ch === '[') { + let j = skipWs(s, i + 1); + const arr: unknown[] = []; + if (s[j] === ']') return { value: arr, end: j + 1 }; + while (j < s.length) { + const val = parseLiteral(s, j); + arr.push(val.value); + j = skipWs(s, val.end); + if (s[j] === ',') { + j = skipWs(s, j + 1); + if (s[j] === ']') return { value: arr, end: j + 1 }; + continue; + } + if (s[j] === ']') return { value: arr, end: j + 1 }; + throw new MetaError(`Expected ',' or ']' at index ${j}`); + } + throw new MetaError('Unterminated array in meta literal'); + } + if (ch === '"' || ch === "'") return parseString(s, i); + if (ch === '-' || (ch !== undefined && ch >= '0' && ch <= '9')) return parseNumber(s, i); + if (s.startsWith('true', i)) return { value: true, end: i + 4 }; + if (s.startsWith('false', i)) return { value: false, end: i + 5 }; + if (s.startsWith('null', i)) return { value: null, end: i + 4 }; + throw new MetaError(`Unexpected token in meta literal at index ${i}: '${s.slice(i, i + 12)}'`); +} + +export function parseMeta(source: string): Meta { + const m = /export\s+const\s+meta\s*=/.exec(source); + if (!m) throw new MetaError('Script is missing `export const meta = { ... 
}`'); + const { value } = parseLiteral(source, m.index + m[0].length); + if (typeof value !== 'object' || value === null || Array.isArray(value)) { + throw new MetaError('meta must be an object literal'); + } + const meta = value as Record<string, unknown>; + if (typeof meta['name'] !== 'string' || typeof meta['description'] !== 'string') { + throw new MetaError('meta requires string `name` and `description`'); + } + return meta as unknown as Meta; +} diff --git a/packages/landlord/src/workflow/sandbox.ts b/packages/landlord/src/workflow/sandbox.ts new file mode 100644 index 0000000..4520a7e --- /dev/null +++ b/packages/landlord/src/workflow/sandbox.ts @@ -0,0 +1,40 @@ +import { WorkflowError } from './errors.ts'; + +function forbidden(name: string): never { + throw new WorkflowError( + `${name} is not available inside a workflow script (nondeterministic or host access)`, + 'workflow.sandbox', + ); +} + +function blockedCallable(name: string): unknown { + const fn = (): never => forbidden(name); + return new Proxy(fn, { + apply: () => forbidden(name), + construct: () => forbidden(name), + get: (_t, prop) => { + if (prop === 'prototype') return undefined; + return () => forbidden(name); + }, + }); +} + +export function sandboxBindings(): Record<string, unknown> { + const safeMath = new Proxy(Math, { + get: (target, prop) => { + if (prop === 'random') return () => forbidden('Math.random'); + return Reflect.get(target, prop); + }, + }); + return { + Date: blockedCallable('Date'), + Math: safeMath, + process: blockedCallable('process'), + require: blockedCallable('require'), + globalThis: blockedCallable('globalThis'), + global: blockedCallable('global'), + fs: blockedCallable('fs'), + eval: blockedCallable('eval'), + Function: blockedCallable('Function'), + }; +} diff --git a/packages/landlord/test/workflow/meta.test.ts b/packages/landlord/test/workflow/meta.test.ts new file mode 100644 index 0000000..7007f80 --- /dev/null +++ b/packages/landlord/test/workflow/meta.test.ts @@ -0,0 +1,22 @@ +import { 
describe, expect, it } from 'vitest'; +import { MetaError } from '../../src/workflow/errors.ts'; +import { parseMeta } from '../../src/workflow/meta.ts'; + +describe('parseMeta', () => { + it('parses a pure object literal with nested arrays', () => { + const meta = parseMeta( + `export const meta = { name: 'rev', description: "Review", phases: [{ title: 'A' }, { title: 'B', detail: 'x' }] }\nphase('A')`, + ); + expect(meta.name).toBe('rev'); + expect(meta.description).toBe('Review'); + expect(meta.phases).toEqual([{ title: 'A' }, { title: 'B', detail: 'x' }]); + }); + + it('rejects a non-literal value (function call) in meta', () => { + expect(() => parseMeta(`export const meta = { name: foo(), description: 'x' }`)).toThrow(MetaError); + }); + + it('rejects meta missing name/description', () => { + expect(() => parseMeta(`export const meta = { name: 'x' }`)).toThrow(MetaError); + }); +}); diff --git a/packages/landlord/test/workflow/sandbox.test.ts b/packages/landlord/test/workflow/sandbox.test.ts new file mode 100644 index 0000000..dbc9ce4 --- /dev/null +++ b/packages/landlord/test/workflow/sandbox.test.ts @@ -0,0 +1,16 @@ +import { describe, expect, it } from 'vitest'; +import { sandboxBindings } from '../../src/workflow/sandbox.ts'; + +describe('sandboxBindings', () => { + it('blocks Date, Math.random, and process but allows pure Math', () => { + const b = sandboxBindings(); + const D = b['Date'] as { now: () => number }; + const M = b['Math'] as Math; + const P = b['process'] as { cwd: () => string }; + expect(() => D.now()).toThrow(); + expect(() => new (b['Date'] as unknown as new () => unknown)()).toThrow(); + expect(() => M.random()).toThrow(); + expect(M.floor(3.7)).toBe(3); + expect(() => P.cwd()).toThrow(); + }); +}); From 3d50e098d58e5b6df98c90919cb0dbdce83daa8d Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 16:55:53 -0600 Subject: [PATCH 14/22] feat(landlord): script compiler and typed defineWorkflow --- 
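Note on this patch: the compiler's central move appears to be building an async function from the script text, with the hook names passed as parameter names so the body sees them as plain identifiers. Below is a minimal, simplified sketch of that idea only. `compileBody`, the `bindings` map, and the fake hooks are hypothetical names for illustration and are not part of this patch; the real `compileScript` also parses `meta`, strips module syntax, and injects sandbox bindings.

```typescript
// Hypothetical, simplified illustration; not the code from this patch.
// Obtain the AsyncFunction constructor, which is not exposed as a global.
const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor as new (
  ...args: string[]
) => (...args: unknown[]) => Promise<unknown>;

// Compile `body` so that every key of `bindings` is visible inside it as a
// local identifier (each key becomes a parameter name of the function).
function compileBody(body: string, bindings: Record<string, unknown>): () => Promise<unknown> {
  const names = Object.keys(bindings);
  const fn = new AsyncFunction(...names, body);
  // Pass the values in the same order as the parameter names.
  return () => fn(...names.map((n) => bindings[n]));
}

// Fake hooks standing in for the real workflow context (assumed shapes).
const calls: string[] = [];
const run = compileBody("log('hi'); return await agent('do ' + args.n);", {
  log: (m: string) => {
    calls.push(`log:${m}`);
  },
  agent: async (p: string) => `R:${p}`,
  args: { n: 2 },
});
```

Because the body is the body of an async function, `await` and a bare `return` work at the top level of the script, which matches what the tests in this patch exercise. Passing names as parameters only shadows globals; it is not a security boundary by itself.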
packages/landlord/src/workflow/define.ts | 17 +++++++ packages/landlord/src/workflow/script.ts | 39 +++++++++++++++ .../landlord/test/workflow/script.test.ts | 48 +++++++++++++++++++ 3 files changed, 104 insertions(+) create mode 100644 packages/landlord/src/workflow/define.ts create mode 100644 packages/landlord/src/workflow/script.ts create mode 100644 packages/landlord/test/workflow/script.test.ts diff --git a/packages/landlord/src/workflow/define.ts b/packages/landlord/src/workflow/define.ts new file mode 100644 index 0000000..ca94400 --- /dev/null +++ b/packages/landlord/src/workflow/define.ts @@ -0,0 +1,17 @@ +// src/workflow/define.ts +import { MetaError } from './errors.ts'; +import type { WorkflowModule } from './types.ts'; + +export function defineWorkflow(def: WorkflowModule): WorkflowModule { + if ( + def.meta === undefined || + typeof def.meta.name !== 'string' || + typeof def.meta.description !== 'string' + ) { + throw new MetaError('defineWorkflow requires meta with string name and description'); + } + if (typeof def.run !== 'function') { + throw new MetaError('defineWorkflow requires a run function'); + } + return def; +} diff --git a/packages/landlord/src/workflow/script.ts b/packages/landlord/src/workflow/script.ts new file mode 100644 index 0000000..ed9b3c3 --- /dev/null +++ b/packages/landlord/src/workflow/script.ts @@ -0,0 +1,39 @@ +// src/workflow/script.ts +import { parseMeta } from './meta.ts'; +import { sandboxBindings } from './sandbox.ts'; +import type { WorkflowContext, WorkflowModule } from './types.ts'; + +const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor as new ( + ...args: string[] +) => (...args: unknown[]) => Promise<unknown>; + +export function stripModuleSyntax(source: string): string { + let body = source.replace(/export\s+const\s+meta\s*=/, 'const __meta__ ='); + body = body.replace(/^\s*import\s.*$/gm, ''); + body = body.replace(/export\s+default\s+/g, 'return '); + body = 
body.replace(/export\s+(const|let|var|function|class)\s/g, '$1 '); + return body; +} + +export function compileScript(source: string): WorkflowModule { + const meta = parseMeta(source); + const body = stripModuleSyntax(source); + const sandbox = sandboxBindings(); + const sandboxNames = Object.keys(sandbox); + const sandboxValues = sandboxNames.map((n) => sandbox[n]); + const hookNames = ['agent', 'parallel', 'pipeline', 'phase', 'log', 'args', 'budget', 'workflow']; + const fn = new AsyncFunction(...hookNames, ...sandboxNames, body); + const run = (wf: WorkflowContext): Promise<unknown> => + fn( + wf.agent, + wf.parallel, + wf.pipeline, + wf.phase, + wf.log, + wf.args, + wf.budget, + wf.workflow, + ...sandboxValues, + ); + return { meta, run }; +} diff --git a/packages/landlord/test/workflow/script.test.ts b/packages/landlord/test/workflow/script.test.ts new file mode 100644 index 0000000..e7c4081 --- /dev/null +++ b/packages/landlord/test/workflow/script.test.ts @@ -0,0 +1,48 @@ +// test/workflow/script.test.ts +import { describe, expect, it } from 'vitest'; +import { compileScript } from '../../src/workflow/script.ts'; +import { defineWorkflow } from '../../src/workflow/define.ts'; +import { MetaError } from '../../src/workflow/errors.ts'; +import type { WorkflowContext } from '../../src/workflow/types.ts'; + +function fakeCtx(calls: string[]): WorkflowContext { + return { + agent: async (p) => { + calls.push(`agent:${p}`); + return 'R'; + }, + parallel: async (thunks) => Promise.all(thunks.map((t) => t())), + pipeline: async (items) => items, + phase: () => {}, + log: (m) => calls.push(`log:${m}`), + args: { n: 2 }, + budget: { total: null, spent: () => 0, remaining: () => Number.POSITIVE_INFINITY }, + workflow: async () => null, + }; +} + +describe('compileScript', () => { + it('parses meta, injects hooks, supports top-level await and return', async () => { + const mod = compileScript( + `export const meta = { name: 'x', description: 'y' }\nlog('hi')\nconst r = await 
agent('do ' + args.n)\nreturn r`, + ); + expect(mod.meta.name).toBe('x'); + const calls: string[] = []; + const result = await mod.run(fakeCtx(calls)); + expect(result).toBe('R'); + expect(calls).toEqual(['log:hi', 'agent:do 2']); + }); + + it('blocks nondeterministic globals at runtime', async () => { + const mod = compileScript(`export const meta = { name: 'a', description: 'b' }\nreturn Date.now()`); + await expect(mod.run(fakeCtx([]))).rejects.toThrow(); + }); +}); + +describe('defineWorkflow', () => { + it('returns the module and validates meta', () => { + const mod = defineWorkflow({ meta: { name: 'm', description: 'd' }, run: async () => 42 }); + expect(mod.meta.name).toBe('m'); + expect(() => defineWorkflow({ meta: { name: 'm' } as never, run: async () => 1 })).toThrow(MetaError); + }); +}); From daa3666eadf4fcd81174d11e278107f48c9f9adc Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:00:06 -0600 Subject: [PATCH 15/22] fix(landlord): block this-based workflow sandbox escape Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/script.ts | 23 +++++++++++-------- .../landlord/test/workflow/script.test.ts | 7 ++++++ 2 files changed, 20 insertions(+), 10 deletions(-) diff --git a/packages/landlord/src/workflow/script.ts b/packages/landlord/src/workflow/script.ts index ed9b3c3..cd8f1dc 100644 --- a/packages/landlord/src/workflow/script.ts +++ b/packages/landlord/src/workflow/script.ts @@ -1,4 +1,5 @@ // src/workflow/script.ts +import { WorkflowError } from './errors.ts'; import { parseMeta } from './meta.ts'; import { sandboxBindings } from './sandbox.ts'; import type { WorkflowContext, WorkflowModule } from './types.ts'; @@ -7,6 +8,15 @@ const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor as new ( ...args: string[] ) => (...args: unknown[]) => Promise<unknown>; +const SANDBOX_THIS = new Proxy(Object.freeze({}), { + get(_target, prop) { + throw new WorkflowError( + `this.${String(prop)} is not available inside a 
workflow script`, + 'workflow.sandbox', + ); + }, +}); + export function stripModuleSyntax(source: string): string { let body = source.replace(/export\s+const\s+meta\s*=/, 'const __meta__ ='); body = body.replace(/^\s*import\s.*$/gm, ''); @@ -24,16 +34,9 @@ export function compileScript(source: string): WorkflowModule { const hookNames = ['agent', 'parallel', 'pipeline', 'phase', 'log', 'args', 'budget', 'workflow']; const fn = new AsyncFunction(...hookNames, ...sandboxNames, body); const run = (wf: WorkflowContext): Promise<unknown> => - fn( - wf.agent, - wf.parallel, - wf.pipeline, - wf.phase, - wf.log, - wf.args, - wf.budget, - wf.workflow, + fn.apply(SANDBOX_THIS, [ + wf.agent, wf.parallel, wf.pipeline, wf.phase, wf.log, wf.args, wf.budget, wf.workflow, ...sandboxValues, - ); + ]); return { meta, run }; } diff --git a/packages/landlord/test/workflow/script.test.ts b/packages/landlord/test/workflow/script.test.ts index e7c4081..18d2e22 100644 --- a/packages/landlord/test/workflow/script.test.ts +++ b/packages/landlord/test/workflow/script.test.ts @@ -37,6 +37,13 @@ describe('compileScript', () => { const mod = compileScript(`export const meta = { name: 'a', description: 'b' }\nreturn Date.now()`); await expect(mod.run(fakeCtx([]))).rejects.toThrow(); }); + + it('blocks sandbox escape via this', async () => { + const mod = compileScript( + `export const meta = { name: 'a', description: 'b' }\nreturn this.Date.now()`, + ); + await expect(mod.run(fakeCtx([]))).rejects.toThrow(); + }); }); describe('defineWorkflow', () => { From be7a33aa4679ba304c1982c7a3fc1dc47363072f Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:02:54 -0600 Subject: [PATCH 16/22] feat(landlord): workflow run engine with resume and nesting Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/runtime.ts | 129 ++++++++++++++++++ .../landlord/test/workflow/runtime.test.ts | 64 +++++++++ 2 files changed, 193 insertions(+) create mode 100644 
packages/landlord/src/workflow/runtime.ts create mode 100644 packages/landlord/test/workflow/runtime.test.ts diff --git a/packages/landlord/src/workflow/runtime.ts b/packages/landlord/src/workflow/runtime.ts new file mode 100644 index 0000000..1e19480 --- /dev/null +++ b/packages/landlord/src/workflow/runtime.ts @@ -0,0 +1,129 @@ +import { randomUUID } from 'node:crypto'; +import { mkdir } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import type { ProviderAdapter, Result } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import type { Budget } from 'flint/budget'; +import type { RunDeps } from './agentcall.ts'; +import { WorkflowBudget } from './budget.ts'; +import { AgentCounter, Semaphore, defaultConcurrency } from './concurrency.ts'; +import { WorkflowError } from './errors.ts'; +import { EventEmitter } from './events.ts'; +import type { EventSink } from './events.ts'; +import { buildContext } from './hooks.ts'; +import { gitWorktreeIsolation, workdirIsolation } from './isolation.ts'; +import type { IsolationBackend } from './isolation.ts'; +import { memoryJournalStore } from './journal.ts'; +import type { JournalStore } from './journal.ts'; +import { createAgentRegistry } from './registry.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { compileScript } from './script.ts'; +import type { Models, WorkflowContext, WorkflowModule, WorkflowRunResult } from './types.ts'; + +export type RuntimeConfig = { + adapter: ProviderAdapter; + models: Models; + args?: unknown; + budget?: Budget; + tokenTarget?: number | null; + registry?: AgentTypeRegistry; + workflows?: WorkflowRegistry; + journal?: JournalStore; + isolation?: IsolationBackend; + worktreeRepoDir?: string; + baseDir?: string; + concurrency?: number; + agentCap?: number; + onEvent?: EventSink; + signal?: AbortSignal; + runId?: string; + resumeFromRunId?: string; +}; + +async function buildDeps(config: 
RuntimeConfig): Promise<RunDeps> { + const runId = config.runId ?? randomUUID().slice(0, 8); + const baseDir = config.baseDir ?? join(tmpdir(), `flint-workflow-${runId}`); + await mkdir(baseDir, { recursive: true }); + const journal = config.journal ?? memoryJournalStore(); + const resumeEntries = + config.resumeFromRunId !== undefined ? await journal.load(config.resumeFromRunId) : []; + let index = 0; + return { + adapter: config.adapter, + models: config.models, + flintBudget: config.budget ?? makeBudget({ maxSteps: 1_000_000 }), + wfBudget: new WorkflowBudget(config.tokenTarget ?? null), + semaphore: new Semaphore(config.concurrency ?? defaultConcurrency()), + counter: new AgentCounter(config.agentCap ?? 1000), + registry: config.registry ?? createAgentRegistry(), + workflows: config.workflows, + isolation: config.isolation ?? workdirIsolation(baseDir), + ...(config.worktreeRepoDir !== undefined + ? { worktreeIsolation: gitWorktreeIsolation(config.worktreeRepoDir, baseDir) } + : { worktreeIsolation: undefined }), + emitter: new EventEmitter(config.onEvent), + journal, + runId, + resumeEntries, + ...(config.signal !== undefined ? 
{ signal: config.signal } : { signal: undefined }), + args: config.args, + depth: 0, + nextIndex: () => index++, + currentPhase: { value: undefined }, + }; +} + +function resolveSource( + ref: string | { scriptPath?: string; source?: string }, + workflows: WorkflowRegistry | undefined, +): string { + if (typeof ref === 'string') { + const src = workflows?.resolve(ref); + if (src === undefined) throw new WorkflowError(`Unknown workflow '${ref}'`, 'workflow.unknown'); + return src; + } + if (ref.source !== undefined) return ref.source; + throw new WorkflowError( + 'workflow(): provide a registered name or { source }; { scriptPath } must be read by the caller.', + 'workflow.unknown', + ); +} + +function executeModule(module: WorkflowModule, deps: RunDeps): Promise<unknown> { + const workflowFn: WorkflowContext['workflow'] = async (ref, childArgs) => { + if (deps.depth >= 1) { + throw new WorkflowError('workflow() nesting is one level only', 'workflow.nesting'); + } + const child = compileScript(resolveSource(ref, deps.workflows)); + return executeModule(child, { ...deps, depth: deps.depth + 1, args: childArgs }); + }; + return module.run(buildContext(deps, workflowFn)); +} + +export async function runWorkflow( + module: WorkflowModule, + config: RuntimeConfig, +): Promise<Result<WorkflowRunResult>> { + const deps = await buildDeps(config); + try { + const result = await executeModule(module, deps); + deps.emitter.emit({ type: 'workflow_complete', result }); + return { ok: true, value: { runId: deps.runId, result, events: deps.emitter.all() } }; + } catch (e) { + return { ok: false, error: e instanceof Error ? e : new Error(String(e)) }; + } +} + +export async function runWorkflowScript( + source: string, + config: RuntimeConfig, +): Promise<Result<WorkflowRunResult>> { + let module: WorkflowModule; + try { + module = compileScript(source); + } catch (e) { + return { ok: false, error: e instanceof Error ? 
e : new Error(String(e)) }; + } + return runWorkflow(module, config); +} diff --git a/packages/landlord/test/workflow/runtime.test.ts b/packages/landlord/test/workflow/runtime.test.ts new file mode 100644 index 0000000..c818114 --- /dev/null +++ b/packages/landlord/test/workflow/runtime.test.ts @@ -0,0 +1,64 @@ +// test/workflow/runtime.test.ts +import type { NormalizedResponse } from 'flint'; +import { mockAdapter, scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { memoryJournalStore } from '../../src/workflow/journal.ts'; +import { runWorkflow, runWorkflowScript } from '../../src/workflow/runtime.ts'; +import { defineWorkflow } from '../../src/workflow/define.ts'; + +function textResponse(content: string): NormalizedResponse { + return { message: { role: 'assistant', content }, usage: { input: 10, output: 5 }, stopReason: 'end' }; +} + +describe('runWorkflow', () => { + it('runs a single-agent script and reports events', async () => { + const adapter = scriptedAdapter([textResponse('hello')]); + const res = await runWorkflowScript( + `export const meta = { name: 'r', description: 'd' }\nreturn await agent('hi')`, + { adapter, models: { default: 'm' } }, + ); + expect(res.ok).toBe(true); + if (res.ok) { + expect(res.value.result).toBe('hello'); + expect(res.value.events.map((e) => e.type)).toContain('workflow_complete'); + } + }); + + it('replays from a prior run without calling the adapter (resume)', async () => { + const journal = memoryJournalStore(); + const source = `export const meta = { name: 'r', description: 'd' }\nreturn await agent('hi')`; + const r1 = await runWorkflowScript(source, { + adapter: scriptedAdapter([textResponse('hello')]), + models: { default: 'm' }, + journal, + runId: 'run1', + }); + expect(r1.ok && r1.value.result).toBe('hello'); + + const throwing = mockAdapter({ onCall: () => { throw new Error('must not be called'); } }); + const r2 = await runWorkflowScript(source, { + adapter: 
throwing, + models: { default: 'm' }, + journal, + runId: 'run2', + resumeFromRunId: 'run1', + }); + expect(r2.ok && r2.value.result).toBe('hello'); + }); + + it('runs a typed workflow via runWorkflow', async () => { + const mod = defineWorkflow({ + meta: { name: 't', description: 'd' }, + run: async (wf) => { + wf.phase('Work'); + return wf.budget.total; + }, + }); + const res = await runWorkflow(mod, { + adapter: scriptedAdapter([]), + models: { default: 'm' }, + tokenTarget: 500, + }); + expect(res.ok && res.value.result).toBe(500); + }); +}); From 8f432c75ed156a9ae39d93ef56f964271769cf32 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:06:28 -0600 Subject: [PATCH 17/22] fix(landlord): re-journal replayed entries for multi-hop resume When resuming from a prior run, replayed journal entries are now written into the current run's journal before returning, making each run's journal self-contained and enabling chains of resume beyond two hops. Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/agentcall.ts | 1 + .../landlord/test/workflow/runtime.test.ts | 43 ++++++++++++++++++- 2 files changed, 42 insertions(+), 2 deletions(-) diff --git a/packages/landlord/src/workflow/agentcall.ts b/packages/landlord/src/workflow/agentcall.ts index f3c0a23..839b57b 100644 --- a/packages/landlord/src/workflow/agentcall.ts +++ b/packages/landlord/src/workflow/agentcall.ts @@ -52,6 +52,7 @@ export async function runAgentCall( // Resume: replay a cached result when this call's signature is unchanged. 
const cached = deps.resumeEntries.find((e) => e.index === index); if (cached !== undefined && cached.hash === hash) { + await deps.journal.append(deps.runId, { index, hash, result: cached.result }); return cached.result; } diff --git a/packages/landlord/test/workflow/runtime.test.ts b/packages/landlord/test/workflow/runtime.test.ts index c818114..f7cd0b3 100644 --- a/packages/landlord/test/workflow/runtime.test.ts +++ b/packages/landlord/test/workflow/runtime.test.ts @@ -7,7 +7,11 @@ import { runWorkflow, runWorkflowScript } from '../../src/workflow/runtime.ts'; import { defineWorkflow } from '../../src/workflow/define.ts'; function textResponse(content: string): NormalizedResponse { - return { message: { role: 'assistant', content }, usage: { input: 10, output: 5 }, stopReason: 'end' }; + return { + message: { role: 'assistant', content }, + usage: { input: 10, output: 5 }, + stopReason: 'end', + }; } describe('runWorkflow', () => { @@ -35,7 +39,11 @@ describe('runWorkflow', () => { }); expect(r1.ok && r1.value.result).toBe('hello'); - const throwing = mockAdapter({ onCall: () => { throw new Error('must not be called'); } }); + const throwing = mockAdapter({ + onCall: () => { + throw new Error('must not be called'); + }, + }); const r2 = await runWorkflowScript(source, { adapter: throwing, models: { default: 'm' }, @@ -46,6 +54,37 @@ describe('runWorkflow', () => { expect(r2.ok && r2.value.result).toBe('hello'); }); + it('supports two-hop resume by re-journaling replayed entries', async () => { + const journal = memoryJournalStore(); + const source = `export const meta = { name: 'r', description: 'd' }\nreturn await agent('hi')`; + await runWorkflowScript(source, { + adapter: scriptedAdapter([textResponse('hello')]), + models: { default: 'm' }, + journal, + runId: 'run1', + }); + const throwing = mockAdapter({ + onCall: () => { + throw new Error('must not be called'); + }, + }); + await runWorkflowScript(source, { + adapter: throwing, + models: { default: 'm' }, + 
journal, + runId: 'run2', + resumeFromRunId: 'run1', + }); + const r3 = await runWorkflowScript(source, { + adapter: throwing, + models: { default: 'm' }, + journal, + runId: 'run3', + resumeFromRunId: 'run2', + }); + expect(r3.ok && r3.value.result).toBe('hello'); + }); + it('runs a typed workflow via runWorkflow', async () => { const mod = defineWorkflow({ meta: { name: 't', description: 'd' }, From 630be770e077a8b6659873fcead4d1161f7c62fc Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:08:48 -0600 Subject: [PATCH 18/22] feat(landlord): workflowTool, orchestratorAgent, and tool guide --- packages/landlord/src/workflow/tool.ts | 131 +++++++++++++++++++ packages/landlord/test/workflow/tool.test.ts | 44 +++++++ 2 files changed, 175 insertions(+) create mode 100644 packages/landlord/src/workflow/tool.ts create mode 100644 packages/landlord/test/workflow/tool.test.ts diff --git a/packages/landlord/src/workflow/tool.ts b/packages/landlord/src/workflow/tool.ts new file mode 100644 index 0000000..df2a7df --- /dev/null +++ b/packages/landlord/src/workflow/tool.ts @@ -0,0 +1,131 @@ +import { agent, tool } from 'flint'; +import type { ProviderAdapter, Result, Tool } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; +import type { Budget } from 'flint/budget'; +import { z } from 'zod'; +import type { EventSink } from './events.ts'; +import type { IsolationBackend } from './isolation.ts'; +import type { JournalStore } from './journal.ts'; +import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +import { runWorkflowScript } from './runtime.ts'; +import type { RuntimeConfig } from './runtime.ts'; +import type { Models } from './types.ts'; + +export type WorkflowToolConfig = { + adapter: ProviderAdapter; + models: Models; + registry?: AgentTypeRegistry; + workflows?: WorkflowRegistry; + journal?: JournalStore; + isolation?: IsolationBackend; + onEvent?: EventSink; +}; + +const workflowToolSchema = z.object({ + script: 
z.string().optional(), + args: z.unknown().optional(), + name: z.string().optional(), + resumeFromRunId: z.string().optional(), +}); + +const WORKFLOW_TOOL_JSON_SCHEMA = { + type: 'object', + properties: { + script: { + type: 'string', + description: 'A workflow JS script beginning with `export const meta = { ... }`.', + }, + args: { description: 'Optional value exposed to the script as `args`.' }, + name: { + type: 'string', + description: 'Name of a registered workflow to run instead of `script`.', + }, + resumeFromRunId: { + type: 'string', + description: 'Resume a prior run, replaying unchanged agents.', + }, + }, +}; + +export function workflowTool(config: WorkflowToolConfig): Tool { + return tool({ + name: 'workflow', + description: + 'Author and run a dynamic multi-agent workflow. Provide a `script` that orchestrates ' + + 'subagents with agent()/parallel()/pipeline()/phase()/log()/budget()/workflow(). ' + + 'Returns JSON { runId, result }.', + input: workflowToolSchema, + jsonSchema: WORKFLOW_TOOL_JSON_SCHEMA, + handler: async (input) => { + let source = input.script; + if (source === undefined && input.name !== undefined) { + source = config.workflows?.resolve(input.name); + } + if (source === undefined) { + return 'Error: provide either a `script` string or a registered `name`.'; + } + const runtimeConfig: RuntimeConfig = { + adapter: config.adapter, + models: config.models, + ...(config.registry !== undefined ? { registry: config.registry } : {}), + ...(config.workflows !== undefined ? { workflows: config.workflows } : {}), + ...(config.journal !== undefined ? { journal: config.journal } : {}), + ...(config.isolation !== undefined ? { isolation: config.isolation } : {}), + ...(config.onEvent !== undefined ? { onEvent: config.onEvent } : {}), + ...(input.args !== undefined ? { args: input.args } : {}), + ...(input.resumeFromRunId !== undefined ? 
{ resumeFromRunId: input.resumeFromRunId } : {}), + }; + const res = await runWorkflowScript(source, runtimeConfig); + if (!res.ok) return `Error: ${res.error.message}`; + return JSON.stringify({ runId: res.value.runId, result: res.value.result }); + }, + }) as unknown as Tool; +} + +export function orchestratorAgent(config: WorkflowToolConfig) { + const wt = workflowTool(config); + return (prompt: string, opts?: { budget?: Budget; model?: string }): ReturnType<typeof agent> => + agent({ + adapter: config.adapter, + model: opts?.model ?? config.models.default, + messages: [ + { role: 'system', content: WORKFLOW_TOOL_GUIDE }, + { role: 'user', content: prompt }, + ], + tools: [wt], + budget: opts?.budget ?? makeBudget({ maxSteps: 50 }), + }); +} + +export const WORKFLOW_TOOL_GUIDE = `You can orchestrate subagents by writing a workflow script and running it with the \`workflow\` tool. + +A script begins with a pure-literal meta block, then a body using injected hooks: + + export const meta = { name: 'review', description: 'Review changes and verify findings' } + phase('Find') + const findings = await parallel(FINDERS.map(f => () => agent(f.prompt, { schema: FINDINGS }))) + return findings.flat().filter(Boolean) + +Hooks available in the script: +- agent(prompt, opts?) — spawn a subagent. Without a schema it returns the agent's final text; with { schema } (a JSON Schema) it is forced to return a validated object. opts: { label, phase, schema, model, isolation: 'worktree', agentType }. +- parallel(thunks) — run thunks concurrently. This is a BARRIER: it awaits all of them. A thunk that throws becomes null in the result array, so filter(Boolean) before use. +- pipeline(items, ...stages) — run each item through every stage independently, with NO barrier between stages. Each stage receives (prevResult, originalItem, index). A throwing stage drops that item to null. This is the DEFAULT for multi-stage work. +- phase(title) / log(message) — progress grouping and narration. 
+- args — the input value passed to the run. +- budget — { total, spent(), remaining() } in output tokens; total may be null. Use for loops: while (budget.total && budget.remaining() > 50000) { ... }. +- workflow(nameOrRef, args?) — run another registered workflow inline (one level only). + +Determinism: Date.now(), new Date(), and Math.random() are unavailable inside scripts (they throw) so runs can be resumed. Pass timestamps via args; vary by index for pseudo-randomness. + +Concurrency is capped automatically; the total number of agents per run is capped at 1000. + +Default to pipeline() — only use a barrier (parallel between stages) when stage N genuinely needs all of stage N-1's results at once (dedup/merge, early-exit on zero, cross-item comparison). + +Quality patterns to compose as the task warrants: +- Adversarial verify: spawn independent skeptics per finding, each prompted to REFUTE; keep only findings that survive a majority. +- Judge panel: generate N independent attempts from different angles, score with parallel judges, synthesize from the winner. +- Loop-until-dry: keep spawning finders until K consecutive rounds surface nothing new. +- Multi-modal sweep: parallel agents each searching a different way; each blind to the others. +- Completeness critic: a final agent that asks "what's missing?" — its answer becomes the next round of work. 
+ +Scale effort to the request: a quick check needs a few agents and single-vote verification; "thoroughly audit this" warrants a larger finder pool plus a 3–5 vote adversarial pass and a synthesis stage.`; diff --git a/packages/landlord/test/workflow/tool.test.ts b/packages/landlord/test/workflow/tool.test.ts new file mode 100644 index 0000000..dc38cfe --- /dev/null +++ b/packages/landlord/test/workflow/tool.test.ts @@ -0,0 +1,44 @@ +import type { NormalizedResponse } from 'flint'; +import { execute } from 'flint'; +import { scriptedAdapter } from 'flint/testing'; +import { describe, expect, it } from 'vitest'; +import { WORKFLOW_TOOL_GUIDE, workflowTool } from '../../src/workflow/tool.ts'; + +function textResponse(content: string): NormalizedResponse { + return { + message: { role: 'assistant', content }, + usage: { input: 10, output: 5 }, + stopReason: 'end', + }; +} + +describe('workflowTool', () => { + it('runs a script supplied as tool input and returns runId + result', async () => { + const adapter = scriptedAdapter([textResponse('inner-result')]); + const tool = workflowTool({ adapter, models: { default: 'm' } }); + const res = await execute(tool, { + script: `export const meta = { name: 'x', description: 'y' }\nreturn await agent('go')`, + }); + expect(res.ok).toBe(true); + if (res.ok) { + const parsed = JSON.parse(res.value as string); + expect(parsed.result).toBe('inner-result'); + expect(typeof parsed.runId).toBe('string'); + } + }); + + it('errors clearly when neither script nor name is provided', async () => { + const tool = workflowTool({ adapter: scriptedAdapter([]), models: { default: 'm' } }); + const res = await execute(tool, {}); + expect(res.ok).toBe(true); + expect(String(res.ok ? 
res.value : '')).toMatch(/provide either/i); + }); +}); + +describe('WORKFLOW_TOOL_GUIDE', () => { + it('documents the core hooks', () => { + expect(WORKFLOW_TOOL_GUIDE).toMatch(/pipeline/); + expect(WORKFLOW_TOOL_GUIDE).toMatch(/parallel/); + expect(WORKFLOW_TOOL_GUIDE).toMatch(/schema/); + }); +}); From 708d0cbf3edfbcc904ff69aa11c6bf5eb9e0639d Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:13:10 -0600 Subject: [PATCH 19/22] style(landlord): clear biome lint errors in workflow modules MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Auto-fixed useLiteralKeys (bracket → dot notation), useTemplate (string concat → template literal), and formatting across src/test. Manually refactored skipWs and parseString in meta.ts to use local `pos` variable instead of reassigning the `i` parameter (noParameterAssign). Dropped unused `Result` type from the import in tool.ts. Co-Authored-By: Claude Sonnet 4.6 --- packages/landlord/src/workflow/agentcall.ts | 8 ++--- packages/landlord/src/workflow/meta.ts | 29 ++++++++++--------- packages/landlord/src/workflow/schema.ts | 2 +- packages/landlord/src/workflow/script.ts | 9 +++++- packages/landlord/src/workflow/tool.ts | 2 +- .../landlord/test/workflow/agentcall.test.ts | 12 ++++---- .../test/workflow/concurrency.test.ts | 2 +- packages/landlord/test/workflow/hooks.test.ts | 2 +- packages/landlord/test/workflow/meta.test.ts | 4 ++- .../landlord/test/workflow/runtime.test.ts | 2 +- .../landlord/test/workflow/sandbox.test.ts | 8 ++--- .../landlord/test/workflow/script.test.ts | 10 +++++-- 12 files changed, 51 insertions(+), 39 deletions(-) diff --git a/packages/landlord/src/workflow/agentcall.ts b/packages/landlord/src/workflow/agentcall.ts index 839b57b..640fd74 100644 --- a/packages/landlord/src/workflow/agentcall.ts +++ b/packages/landlord/src/workflow/agentcall.ts @@ -3,13 +3,13 @@ import { agent } from 'flint'; import type { ProviderAdapter } from 'flint'; import type { 
Budget } from 'flint/budget'; import { standardTools } from '../tools/index.ts'; -import type { AgentCounter, Semaphore } from './concurrency.ts'; import type { WorkflowBudget } from './budget.ts'; +import type { AgentCounter, Semaphore } from './concurrency.ts'; import { WorkflowError } from './errors.ts'; import type { EventEmitter } from './events.ts'; +import type { IsolationBackend } from './isolation.ts'; import { hashCall } from './journal.ts'; import type { JournalEntry, JournalStore } from './journal.ts'; -import type { IsolationBackend } from './isolation.ts'; import type { AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; import { makeStructuredOutput } from './schema.ts'; import type { AgentOpts, Models } from './types.ts'; @@ -93,9 +93,7 @@ export async function runAgentCall( if (opts?.schema !== undefined) { const so = makeStructuredOutput(opts.schema); - const systemPrompt = - `${preset.systemPrompt}\n\nYou MUST call the structured_output tool exactly once with your ` + - 'final result as JSON matching the required schema. Do not finish until you have called it.'; + const systemPrompt = `${preset.systemPrompt}\n\nYou MUST call the structured_output tool exactly once with your final result as JSON matching the required schema. 
Do not finish until you have called it.`; const out = await agent({ adapter: deps.adapter, model, diff --git a/packages/landlord/src/workflow/meta.ts b/packages/landlord/src/workflow/meta.ts index cce4967..77cb9b5 100644 --- a/packages/landlord/src/workflow/meta.ts +++ b/packages/landlord/src/workflow/meta.ts @@ -2,21 +2,22 @@ import { MetaError } from './errors.ts'; import type { Meta } from './types.ts'; function skipWs(s: string, i: number): number { - while (i < s.length) { - const c = s[i]; - if (c === ' ' || c === '\n' || c === '\t' || c === '\r') i++; + let pos = i; + while (pos < s.length) { + const c = s[pos]; + if (c === ' ' || c === '\n' || c === '\t' || c === '\r') pos++; else break; } - return i; + return pos; } function parseString(s: string, i: number): { value: string; end: number } { const quote = s[i]; - i++; + let pos = i + 1; let out = ''; - while (i < s.length && s[i] !== quote) { - if (s[i] === '\\') { - const n = s[i + 1]; + while (pos < s.length && s[pos] !== quote) { + if (s[pos] === '\\') { + const n = s[pos + 1]; out += n === 'n' ? '\n' @@ -29,14 +30,14 @@ function parseString(s: string, i: number): { value: string; end: number } { : n === quote ? quote : (n ?? 
''); - i += 2; + pos += 2; } else { - out += s[i]; - i++; + out += s[pos]; + pos++; } } - if (s[i] !== quote) throw new MetaError('Unterminated string in meta literal'); - return { value: out, end: i + 1 }; + if (s[pos] !== quote) throw new MetaError('Unterminated string in meta literal'); + return { value: out, end: pos + 1 }; } function parseNumber(s: string, i: number): { value: number; end: number } { @@ -112,7 +113,7 @@ export function parseMeta(source: string): Meta { throw new MetaError('meta must be an object literal'); } const meta = value as Record; - if (typeof meta['name'] !== 'string' || typeof meta['description'] !== 'string') { + if (typeof meta.name !== 'string' || typeof meta.description !== 'string') { throw new MetaError('meta requires string `name` and `description`'); } return meta as unknown as Meta; diff --git a/packages/landlord/src/workflow/schema.ts b/packages/landlord/src/workflow/schema.ts index 2054a21..2794070 100644 --- a/packages/landlord/src/workflow/schema.ts +++ b/packages/landlord/src/workflow/schema.ts @@ -31,7 +31,7 @@ export type StructuredOutput = { * message on mismatch so the agent loop retries. */ export function makeStructuredOutput(schema: Record): StructuredOutput { - const wrapped = schema['type'] !== 'object'; + const wrapped = schema.type !== 'object'; const jsonSchema: Record = wrapped ? 
{ type: 'object', properties: { result: schema }, required: ['result'] } : schema; diff --git a/packages/landlord/src/workflow/script.ts b/packages/landlord/src/workflow/script.ts index cd8f1dc..0ed0a95 100644 --- a/packages/landlord/src/workflow/script.ts +++ b/packages/landlord/src/workflow/script.ts @@ -35,7 +35,14 @@ export function compileScript(source: string): WorkflowModule { const fn = new AsyncFunction(...hookNames, ...sandboxNames, body); const run = (wf: WorkflowContext): Promise => fn.apply(SANDBOX_THIS, [ - wf.agent, wf.parallel, wf.pipeline, wf.phase, wf.log, wf.args, wf.budget, wf.workflow, + wf.agent, + wf.parallel, + wf.pipeline, + wf.phase, + wf.log, + wf.args, + wf.budget, + wf.workflow, ...sandboxValues, ]); return { meta, run }; diff --git a/packages/landlord/src/workflow/tool.ts b/packages/landlord/src/workflow/tool.ts index df2a7df..4dec2fe 100644 --- a/packages/landlord/src/workflow/tool.ts +++ b/packages/landlord/src/workflow/tool.ts @@ -1,5 +1,5 @@ import { agent, tool } from 'flint'; -import type { ProviderAdapter, Result, Tool } from 'flint'; +import type { ProviderAdapter, Tool } from 'flint'; import { budget as makeBudget } from 'flint/budget'; import type { Budget } from 'flint/budget'; import { z } from 'zod'; diff --git a/packages/landlord/test/workflow/agentcall.test.ts b/packages/landlord/test/workflow/agentcall.test.ts index 0f40d3e..e6f381b 100644 --- a/packages/landlord/test/workflow/agentcall.test.ts +++ b/packages/landlord/test/workflow/agentcall.test.ts @@ -1,19 +1,19 @@ -// test/workflow/agentcall.test.ts -import type { NormalizedResponse } from 'flint'; -import { budget as makeBudget } from 'flint/budget'; import { mkdtemp } from 'node:fs/promises'; import { tmpdir } from 'node:os'; import { join } from 'node:path'; +// test/workflow/agentcall.test.ts +import type { NormalizedResponse } from 'flint'; +import { budget as makeBudget } from 'flint/budget'; import { mockAdapter, scriptedAdapter } from 'flint/testing'; import 
{ describe, expect, it } from 'vitest'; +import { runAgentCall } from '../../src/workflow/agentcall.ts'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; import { WorkflowBudget } from '../../src/workflow/budget.ts'; import { AgentCounter, Semaphore } from '../../src/workflow/concurrency.ts'; import { EventEmitter } from '../../src/workflow/events.ts'; +import { workdirIsolation } from '../../src/workflow/isolation.ts'; import { memoryJournalStore } from '../../src/workflow/journal.ts'; import { createAgentRegistry } from '../../src/workflow/registry.ts'; -import { workdirIsolation } from '../../src/workflow/isolation.ts'; -import { runAgentCall } from '../../src/workflow/agentcall.ts'; -import type { RunDeps } from '../../src/workflow/agentcall.ts'; function textResponse(content: string): NormalizedResponse { return { diff --git a/packages/landlord/test/workflow/concurrency.test.ts b/packages/landlord/test/workflow/concurrency.test.ts index 0843b46..8df5169 100644 --- a/packages/landlord/test/workflow/concurrency.test.ts +++ b/packages/landlord/test/workflow/concurrency.test.ts @@ -1,7 +1,7 @@ // test/workflow/concurrency.test.ts import { describe, expect, it } from 'vitest'; -import { AgentCapError } from '../../src/workflow/errors.ts'; import { AgentCounter, Semaphore, defaultConcurrency } from '../../src/workflow/concurrency.ts'; +import { AgentCapError } from '../../src/workflow/errors.ts'; describe('Semaphore', () => { it('never runs more than `limit` tasks concurrently', async () => { diff --git a/packages/landlord/test/workflow/hooks.test.ts b/packages/landlord/test/workflow/hooks.test.ts index 230a562..23abb96 100644 --- a/packages/landlord/test/workflow/hooks.test.ts +++ b/packages/landlord/test/workflow/hooks.test.ts @@ -1,9 +1,9 @@ // test/workflow/hooks.test.ts import { describe, expect, it } from 'vitest'; +import type { RunDeps } from '../../src/workflow/agentcall.ts'; import { WorkflowBudget } from '../../src/workflow/budget.ts'; 
import { EventEmitter } from '../../src/workflow/events.ts'; import { buildContext } from '../../src/workflow/hooks.ts'; -import type { RunDeps } from '../../src/workflow/agentcall.ts'; function fakeDeps(): RunDeps { return { diff --git a/packages/landlord/test/workflow/meta.test.ts b/packages/landlord/test/workflow/meta.test.ts index 7007f80..10f21b9 100644 --- a/packages/landlord/test/workflow/meta.test.ts +++ b/packages/landlord/test/workflow/meta.test.ts @@ -13,7 +13,9 @@ describe('parseMeta', () => { }); it('rejects a non-literal value (function call) in meta', () => { - expect(() => parseMeta(`export const meta = { name: foo(), description: 'x' }`)).toThrow(MetaError); + expect(() => parseMeta(`export const meta = { name: foo(), description: 'x' }`)).toThrow( + MetaError, + ); }); it('rejects meta missing name/description', () => { diff --git a/packages/landlord/test/workflow/runtime.test.ts b/packages/landlord/test/workflow/runtime.test.ts index f7cd0b3..f4ed13b 100644 --- a/packages/landlord/test/workflow/runtime.test.ts +++ b/packages/landlord/test/workflow/runtime.test.ts @@ -2,9 +2,9 @@ import type { NormalizedResponse } from 'flint'; import { mockAdapter, scriptedAdapter } from 'flint/testing'; import { describe, expect, it } from 'vitest'; +import { defineWorkflow } from '../../src/workflow/define.ts'; import { memoryJournalStore } from '../../src/workflow/journal.ts'; import { runWorkflow, runWorkflowScript } from '../../src/workflow/runtime.ts'; -import { defineWorkflow } from '../../src/workflow/define.ts'; function textResponse(content: string): NormalizedResponse { return { diff --git a/packages/landlord/test/workflow/sandbox.test.ts b/packages/landlord/test/workflow/sandbox.test.ts index dbc9ce4..4de2fe9 100644 --- a/packages/landlord/test/workflow/sandbox.test.ts +++ b/packages/landlord/test/workflow/sandbox.test.ts @@ -4,11 +4,11 @@ import { sandboxBindings } from '../../src/workflow/sandbox.ts'; describe('sandboxBindings', () => { it('blocks 
Date, Math.random, and process but allows pure Math', () => { const b = sandboxBindings(); - const D = b['Date'] as { now: () => number }; - const M = b['Math'] as Math; - const P = b['process'] as { cwd: () => string }; + const D = b.Date as { now: () => number }; + const M = b.Math as Math; + const P = b.process as { cwd: () => string }; expect(() => D.now()).toThrow(); - expect(() => new (b['Date'] as unknown as new () => unknown)()).toThrow(); + expect(() => new (b.Date as unknown as new () => unknown)()).toThrow(); expect(() => M.random()).toThrow(); expect(M.floor(3.7)).toBe(3); expect(() => P.cwd()).toThrow(); diff --git a/packages/landlord/test/workflow/script.test.ts b/packages/landlord/test/workflow/script.test.ts index 18d2e22..62cea9e 100644 --- a/packages/landlord/test/workflow/script.test.ts +++ b/packages/landlord/test/workflow/script.test.ts @@ -1,8 +1,8 @@ // test/workflow/script.test.ts import { describe, expect, it } from 'vitest'; -import { compileScript } from '../../src/workflow/script.ts'; import { defineWorkflow } from '../../src/workflow/define.ts'; import { MetaError } from '../../src/workflow/errors.ts'; +import { compileScript } from '../../src/workflow/script.ts'; import type { WorkflowContext } from '../../src/workflow/types.ts'; function fakeCtx(calls: string[]): WorkflowContext { @@ -34,7 +34,9 @@ describe('compileScript', () => { }); it('blocks nondeterministic globals at runtime', async () => { - const mod = compileScript(`export const meta = { name: 'a', description: 'b' }\nreturn Date.now()`); + const mod = compileScript( + `export const meta = { name: 'a', description: 'b' }\nreturn Date.now()`, + ); await expect(mod.run(fakeCtx([]))).rejects.toThrow(); }); @@ -50,6 +52,8 @@ describe('defineWorkflow', () => { it('returns the module and validates meta', () => { const mod = defineWorkflow({ meta: { name: 'm', description: 'd' }, run: async () => 42 }); expect(mod.meta.name).toBe('m'); - expect(() => defineWorkflow({ meta: { name: 
'm' } as never, run: async () => 1 })).toThrow(MetaError); + expect(() => defineWorkflow({ meta: { name: 'm' } as never, run: async () => 1 })).toThrow( + MetaError, + ); }); }); From a7345ff41e3ddc2d359845eb9dc47ab7443c5751 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:16:20 -0600 Subject: [PATCH 20/22] feat(landlord): rebuild orchestrate on runtime; export workflow surface --- packages/landlord/package.json | 3 +- packages/landlord/src/index.ts | 1 + packages/landlord/src/orchestrate.ts | 187 ++++++++++++------------ packages/landlord/src/workflow/index.ts | 35 +++++ packages/landlord/tsup.config.ts | 2 +- 5 files changed, 135 insertions(+), 93 deletions(-) create mode 100644 packages/landlord/src/workflow/index.ts diff --git a/packages/landlord/package.json b/packages/landlord/package.json index 6885c1e..12c54bc 100644 --- a/packages/landlord/package.json +++ b/packages/landlord/package.json @@ -9,7 +9,8 @@ "files": ["dist", "README.md", "LICENSE"], "exports": { ".": { "types": "./dist/index.d.ts", "import": "./dist/index.js" }, - "./tools": { "types": "./dist/tools/index.d.ts", "import": "./dist/tools/index.js" } + "./tools": { "types": "./dist/tools/index.d.ts", "import": "./dist/tools/index.js" }, + "./workflow": { "types": "./dist/workflow/index.d.ts", "import": "./dist/workflow/index.js" } }, "scripts": { "build": "tsup", diff --git a/packages/landlord/src/index.ts b/packages/landlord/src/index.ts index a2d0f48..cc49063 100644 --- a/packages/landlord/src/index.ts +++ b/packages/landlord/src/index.ts @@ -11,3 +11,4 @@ export type { TenantOutcome, } from './orchestrate.ts'; export type { ValidationVerdict } from './validate.ts'; +export * from './workflow/index.ts'; diff --git a/packages/landlord/src/orchestrate.ts b/packages/landlord/src/orchestrate.ts index 9ad9ae8..f331458 100644 --- a/packages/landlord/src/orchestrate.ts +++ b/packages/landlord/src/orchestrate.ts @@ -6,6 +6,8 @@ import type { Budget } from 'flint/budget'; import type { 
Contract } from './contract.ts'; import { decompose } from './decompose.ts'; import { runTenant } from './tenant.ts'; +import { defineWorkflow } from './workflow/define.ts'; +import { runWorkflow } from './workflow/runtime.ts'; export class DependencyCycleError extends Error { constructor(message: string) { @@ -75,7 +77,6 @@ export async function orchestrate( toolsFactory: (workDir: string) => Tool[], config: OrchestratorConfig, ): Promise> { - // Decompose const decomposeResult = await decompose(prompt, { adapter: config.adapter, model: config.landlordModel, @@ -93,103 +94,107 @@ export async function orchestrate( const baseOutputDir = config.outputDir ?? join(tmpdir(), `landlord-${Date.now()}`); await mkdir(join(baseOutputDir, 'shared'), { recursive: true }); - // Per-role gate: resolves with artifacts when the tenant completes - const gates = new Map< - string, - { promise: Promise>; resolve: (v: Record) => void } - >(); - for (const c of plan) { - let resolve!: (v: Record) => void; - const promise = new Promise>((r) => { - resolve = r; - }); - gates.set(c.role, { promise, resolve }); - } + const module = defineWorkflow({ + meta: { name: 'auto-decompose', description: 'Landlord auto-decomposition orchestration' }, + run: async (wf): Promise => { + const gates = new Map< + string, + { promise: Promise>; resolve: (v: Record) => void } + >(); + for (const c of plan) { + let resolve!: (v: Record) => void; + const promise = new Promise>((r) => { + resolve = r; + }); + gates.set(c.role, { promise, resolve }); + } - const escalatedRoles = new Set(); - const tenantOutcomes: Record = {}; - const jobArtifacts: Record> = {}; + const escalatedRoles = new Set(); + const tenantOutcomes: Record = {}; + const jobArtifacts: Record> = {}; + + async function runWithRetry(contract: Contract): Promise { + for (const dep of contract.dependsOn) { + await gates.get(dep)?.promise; + if (escalatedRoles.has(dep)) { + const lastError = `Dependency '${dep}' escalated before this tenant 
could start`; + escalatedRoles.add(contract.role); + tenantOutcomes[contract.role] = { status: 'escalated', lastError, retriesExhausted: 0 }; + gates.get(contract.role)?.resolve({}); + config.onEvent?.({ type: 'tenant_escalated', role: contract.role }); + return; + } + } + + const sharedArtifacts: Record = {}; + for (const dep of contract.dependsOn) { + const depArtifacts = jobArtifacts[dep] ?? {}; + for (const [k, v] of Object.entries(depArtifacts)) { + sharedArtifacts[`${dep}.${k}`] = v; + } + } + + const workDir = join(baseOutputDir, contract.role); + await mkdir(workDir, { recursive: true }); + config.onEvent?.({ type: 'tenant_started', role: contract.role }); + + let lastError: string | undefined; + for (let attempt = 0; attempt < contract.maxRetries; attempt++) { + const result = await runTenant( + contract, + toolsFactory(workDir), + { + adapter: config.adapter, + model: config.tenantModel, + ...(config.budget !== undefined ? { budget: config.budget } : {}), + workDir, + }, + lastError, + Object.keys(sharedArtifacts).length > 0 ? 
sharedArtifacts : undefined, + ); + + if (result.ok) { + jobArtifacts[contract.role] = result.value; + tenantOutcomes[contract.role] = { status: 'complete', artifacts: result.value }; + gates.get(contract.role)?.resolve(result.value); + config.onEvent?.({ type: 'tenant_complete', role: contract.role }); + return; + } + + lastError = result.error.message; + config.onEvent?.({ + type: 'tenant_evicted', + role: contract.role, + reason: lastError, + retry: attempt + 1, + }); + } - async function runWithRetry(contract: Contract): Promise { - // Wait for dependencies - for (const dep of contract.dependsOn) { - await gates.get(dep)?.promise; - if (escalatedRoles.has(dep)) { - const lastError = `Dependency '${dep}' escalated before this tenant could start`; escalatedRoles.add(contract.role); - tenantOutcomes[contract.role] = { status: 'escalated', lastError, retriesExhausted: 0 }; + tenantOutcomes[contract.role] = { + status: 'escalated', + lastError: lastError ?? 'unknown', + retriesExhausted: contract.maxRetries, + }; gates.get(contract.role)?.resolve({}); config.onEvent?.({ type: 'tenant_escalated', role: contract.role }); - return; - } - } - - // Build shared context from dependencies - const sharedArtifacts: Record = {}; - for (const dep of contract.dependsOn) { - const depArtifacts = jobArtifacts[dep] ?? {}; - for (const [k, v] of Object.entries(depArtifacts)) { - sharedArtifacts[`${dep}.${k}`] = v; - } - } - - const workDir = join(baseOutputDir, contract.role); - await mkdir(workDir, { recursive: true }); - - config.onEvent?.({ type: 'tenant_started', role: contract.role }); - - let lastError: string | undefined; - - for (let attempt = 0; attempt < contract.maxRetries; attempt++) { - const result = await runTenant( - contract, - toolsFactory(workDir), - { - adapter: config.adapter, - model: config.tenantModel, - ...(config.budget !== undefined ? { budget: config.budget } : {}), - workDir, - }, - lastError, - Object.keys(sharedArtifacts).length > 0 ? 
sharedArtifacts : undefined, - ); - - if (result.ok) { - jobArtifacts[contract.role] = result.value; - tenantOutcomes[contract.role] = { status: 'complete', artifacts: result.value }; - gates.get(contract.role)?.resolve(result.value); - config.onEvent?.({ type: 'tenant_complete', role: contract.role }); - return; } - lastError = result.error.message; - config.onEvent?.({ - type: 'tenant_evicted', - role: contract.role, - reason: lastError, - retry: attempt + 1, - }); - } - - // All retries exhausted - escalatedRoles.add(contract.role); - tenantOutcomes[contract.role] = { - status: 'escalated', - lastError: lastError ?? 'unknown', - retriesExhausted: contract.maxRetries, - }; - gates.get(contract.role)?.resolve({}); - config.onEvent?.({ type: 'tenant_escalated', role: contract.role }); - } - - await Promise.all(plan.map((c) => runWithRetry(c))); + await wf.parallel(plan.map((c) => () => runWithRetry(c))); - const allComplete = Object.values(tenantOutcomes).every((o) => o.status === 'complete'); - const status = allComplete ? 'complete' : 'partial'; - config.onEvent?.({ type: 'job_complete', artifacts: jobArtifacts }); + const allComplete = Object.values(tenantOutcomes).every((o) => o.status === 'complete'); + const status: 'complete' | 'partial' = allComplete ? 'complete' : 'partial'; + config.onEvent?.({ type: 'job_complete', artifacts: jobArtifacts }); + return { status, tenants: tenantOutcomes, artifacts: jobArtifacts }; + }, + }); - return { - ok: true, - value: { status, tenants: tenantOutcomes, artifacts: jobArtifacts }, - }; + const runResult = await runWorkflow(module, { + adapter: config.adapter, + models: { default: config.tenantModel }, + ...(config.budget !== undefined ? 
{ budget: config.budget } : {}), + baseDir: baseOutputDir, + }); + if (!runResult.ok) return runResult; + return { ok: true, value: runResult.value.result as OrchestrateResult }; } diff --git a/packages/landlord/src/workflow/index.ts b/packages/landlord/src/workflow/index.ts new file mode 100644 index 0000000..5d04629 --- /dev/null +++ b/packages/landlord/src/workflow/index.ts @@ -0,0 +1,35 @@ +export { runWorkflow, runWorkflowScript } from './runtime.ts'; +export type { RuntimeConfig } from './runtime.ts'; +export { defineWorkflow } from './define.ts'; +export { compileScript, stripModuleSyntax } from './script.ts'; +export { parseMeta, parseLiteral } from './meta.ts'; +export { sandboxBindings } from './sandbox.ts'; +export { workflowTool, orchestratorAgent, WORKFLOW_TOOL_GUIDE } from './tool.ts'; +export type { WorkflowToolConfig } from './tool.ts'; +export { + createAgentRegistry, + createWorkflowRegistry, + BUILT_IN_AGENT_TYPES, +} from './registry.ts'; +export type { AgentType, AgentTypeRegistry, WorkflowRegistry } from './registry.ts'; +export { memoryJournalStore, fileJournalStore, hashCall } from './journal.ts'; +export type { JournalEntry, JournalStore } from './journal.ts'; +export { workdirIsolation, gitWorktreeIsolation } from './isolation.ts'; +export type { IsolationBackend, IsolationLease } from './isolation.ts'; +export { WorkflowBudget, budgetView } from './budget.ts'; +export { Semaphore, AgentCounter, defaultConcurrency } from './concurrency.ts'; +export { EventEmitter } from './events.ts'; +export type { EventSink } from './events.ts'; +export { WorkflowError, AgentCapError, MetaError } from './errors.ts'; +export type { + AgentOpts, + Meta, + MetaPhase, + Models, + StageFn, + WorkflowBudgetView, + WorkflowContext, + WorkflowEvent, + WorkflowModule, + WorkflowRunResult, +} from './types.ts'; diff --git a/packages/landlord/tsup.config.ts b/packages/landlord/tsup.config.ts index 18c60ac..203e022 100644 --- a/packages/landlord/tsup.config.ts +++ 
b/packages/landlord/tsup.config.ts @@ -2,7 +2,7 @@ import { defineConfig } from 'tsup'; // biome-ignore lint/style/noDefaultExport: tsup config requires default export export default defineConfig({ - entry: ['src/index.ts', 'src/tools/index.ts'], + entry: ['src/index.ts', 'src/tools/index.ts', 'src/workflow/index.ts'], format: ['esm'], dts: true, clean: true, From 3813912cbd5c913bd815e5e976583cb5726aa956 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:26:07 -0600 Subject: [PATCH 21/22] docs(landlord): dynamic-workflow runtime docs, example, and changeset Add 6 new docs pages (workflow, hooks, resume, agent-types, isolation, workflow-tool), a dynamic-workflow example, sidebar/nav updates, README mention of the workflow runtime, and a minor changeset entry. Co-Authored-By: Claude Opus 4.8 (1M context) --- .changeset/landlord-dynamic-workflows.md | 5 + README.md | 3 +- docs/.vitepress/config.ts | 7 + docs/examples/dynamic-workflow.md | 319 +++++++++++++++++++++++ docs/landlord/agent-types.md | 169 ++++++++++++ docs/landlord/hooks.md | 232 +++++++++++++++++ docs/landlord/index.md | 45 ++++ docs/landlord/isolation.md | 120 +++++++++ docs/landlord/orchestrate.md | 4 + docs/landlord/resume.md | 163 ++++++++++++ docs/landlord/workflow-tool.md | 194 ++++++++++++++ docs/landlord/workflow.md | 201 ++++++++++++++ 12 files changed, 1461 insertions(+), 1 deletion(-) create mode 100644 .changeset/landlord-dynamic-workflows.md create mode 100644 docs/examples/dynamic-workflow.md create mode 100644 docs/landlord/agent-types.md create mode 100644 docs/landlord/hooks.md create mode 100644 docs/landlord/isolation.md create mode 100644 docs/landlord/resume.md create mode 100644 docs/landlord/workflow-tool.md create mode 100644 docs/landlord/workflow.md diff --git a/.changeset/landlord-dynamic-workflows.md b/.changeset/landlord-dynamic-workflows.md new file mode 100644 index 0000000..cc41851 --- /dev/null +++ b/.changeset/landlord-dynamic-workflows.md @@ -0,0 +1,5 @@ 
+--- +"landlord": minor +--- + +Add a dynamic-workflow runtime: author workflows as typed functions (`defineWorkflow`) or model-written JS scripts (`runWorkflowScript`) that orchestrate subagents via `agent`/`parallel`/`pipeline`/`phase`/`log`/`args`/`budget`/`workflow` hooks, with structured-output schemas, concurrency/agent caps, resume/journaling, a determinism sandbox, an agent-type registry, isolation backends, and a model-facing `workflowTool`. `orchestrate()` is now built on this runtime (API unchanged). diff --git a/README.md b/README.md index b820b3f..b1f7e8e 100644 --- a/README.md +++ b/README.md @@ -168,6 +168,7 @@ if (out.ok) console.log(out.value.message.content); // "579" | `@flint/adapter-anthropic` | Anthropic Messages API — prompt-cache aware | | `@flint/adapter-openai-compat` | Any OpenAI-compatible endpoint | | `@flint/graph` | State-machine agent workflows | +| `@flint/landlord` | Multi-agent orchestration: dynamic workflow runtime (ultracode-style script orchestration) and auto-decompose `orchestrate()` | ## Flint vs LangChain @@ -303,7 +304,7 @@ Full documentation at **[dizzymii.github.io/Flint](https://dizzymii.github.io/Fl - [Features](https://dizzymii.github.io/Flint/features/budget) — budget, compress, memory, RAG, recipes, safety, graph - [Adapters](https://dizzymii.github.io/Flint/adapters/anthropic) — Anthropic, OpenAI-compatible, custom - [Examples](https://dizzymii.github.io/Flint/examples/basic-call) — basic call, tools, agent, streaming, RAG, multi-agent, memory, graph -- [Landlord](https://dizzymii.github.io/Flint/landlord/) — `@flint/landlord` multi-agent orchestration +- [Landlord](https://dizzymii.github.io/Flint/landlord/) — `@flint/landlord` dynamic workflow runtime and multi-agent orchestration - [Reference](https://dizzymii.github.io/Flint/reference/errors) — error types catalog ## Contributing diff --git a/docs/.vitepress/config.ts b/docs/.vitepress/config.ts index a28c7b8..b66f118 100644 --- a/docs/.vitepress/config.ts +++ 
b/docs/.vitepress/config.ts @@ -100,6 +100,7 @@ export default defineConfig({ { text: 'Tool Approval', link: '/examples/tool-approval' }, { text: 'Memory Agent', link: '/examples/memory-agent' }, { text: 'Graph Workflow', link: '/examples/graph-workflow' }, + { text: 'Dynamic Workflow', link: '/examples/dynamic-workflow' }, ], }, ], @@ -108,6 +109,12 @@ export default defineConfig({ text: 'Landlord', items: [ { text: 'Overview', link: '/landlord/' }, + { text: 'Workflows', link: '/landlord/workflow' }, + { text: 'Hooks', link: '/landlord/hooks' }, + { text: 'Resume', link: '/landlord/resume' }, + { text: 'Agent Types', link: '/landlord/agent-types' }, + { text: 'Isolation', link: '/landlord/isolation' }, + { text: 'Workflow Tool', link: '/landlord/workflow-tool' }, { text: 'Contracts', link: '/landlord/contract' }, { text: 'decompose()', link: '/landlord/decompose' }, { text: 'orchestrate()', link: '/landlord/orchestrate' }, diff --git a/docs/examples/dynamic-workflow.md b/docs/examples/dynamic-workflow.md new file mode 100644 index 0000000..380a5ea --- /dev/null +++ b/docs/examples/dynamic-workflow.md @@ -0,0 +1,319 @@ +# Dynamic Workflow: Review and Verify Pipeline + +This example implements a two-phase security review pipeline using the `@flint/landlord` workflow runtime. It shows the same workflow written two ways: as a string script (for model-authored workflows) and as a typed `defineWorkflow` (for production code). 
+ +## What this demonstrates + +- `runWorkflowScript` — executing a model-authored JS string +- `defineWorkflow` + `runWorkflow` — the typed authoring path +- `parallel` for a barrier gather, `pipeline` for per-item multi-stage processing +- `schema` for structured output per agent +- `onEvent` progress logging +- `fileJournalStore` for crash-safe resume + +## Setup + +```ts +import { + defineWorkflow, + fileJournalStore, + runWorkflow, + runWorkflowScript, +} from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; +import { join } from 'node:path'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }); +const journal = fileJournalStore(join(process.cwd(), '.workflow-journal')); + +function onEvent(e: import('@flint/landlord').WorkflowEvent) { + switch (e.type) { + case 'phase_started': + console.log(`\n=== ${e.title} ===`); + break; + case 'agent_started': + console.log(` → ${e.label} [${e.agentType}] (${e.model})`); + break; + case 'agent_complete': + console.log(` ✓ ${e.label} (${e.tokens} tokens)`); + break; + case 'agent_error': + console.error(` ✗ ${e.label}: ${e.error}`); + break; + case 'workflow_complete': + console.log('\n[workflow complete]'); + break; + } +} +``` + +## Version 1: string script + +The same logic as a model-authored JS string. This is the format the model writes when using `workflowTool`. 
+ +```ts +const source = ` +export const meta = { + name: 'security-review', + description: 'Scan files for issues, then verify each finding independently', + phases: [ + { title: 'Scan', detail: 'Parallel scan per file' }, + { title: 'Verify', detail: 'Independent verification per finding' } + ] +} + +const files = args + +// Phase 1: scan all files in parallel (barrier — we need all findings before verifying) +phase('Scan') +const FINDING_SCHEMA = { + type: 'object', + properties: { + file: { type: 'string' }, + issues: { type: 'array', items: { type: 'string' } }, + severity: { type: 'string', enum: ['low', 'medium', 'high', 'critical'] } + }, + required: ['file', 'issues', 'severity'] +} + +const rawFindings = await parallel( + files.map(f => () => agent('Scan ' + f + ' for security vulnerabilities', { + label: 'scan:' + f, + agentType: 'code-reviewer', + schema: FINDING_SCHEMA + })) +) + +const findings = rawFindings.filter(Boolean) +log('Found ' + findings.length + ' scan results') + +if (findings.length === 0) { + return { findings: [], verified: [], confirmed: [] } +} + +// Phase 2: verify each finding independently, no barrier needed between items +phase('Verify') +const VERIFY_SCHEMA = { + type: 'object', + properties: { + confirmed: { type: 'boolean' }, + reason: { type: 'string' }, + severity: { type: 'string', enum: ['low', 'medium', 'high', 'critical'] } + }, + required: ['confirmed', 'reason', 'severity'] +} + +const verified = await pipeline( + findings, + (finding) => agent( + 'You are an independent security reviewer. Verify this finding — is it a real vulnerability, ' + + 'or a false positive? Be skeptical. 
Finding: ' + JSON.stringify(finding), + { + label: 'verify:' + finding.file, + agentType: 'code-reviewer', + schema: VERIFY_SCHEMA + } + ) +) + +const confirmed = verified.filter(v => v?.confirmed) +log('Confirmed ' + confirmed.length + ' of ' + findings.length + ' findings') + +return { + findings, + verified, + confirmed +} +`; + +const files = ['src/auth.ts', 'src/api.ts', 'src/db.ts']; + +const result1 = await runWorkflowScript(source, { + adapter, + models: { default: 'claude-opus-4-7' }, + args: files, + journal, + runId: 'review-001', + onEvent, +}); + +if (result1.ok) { + const { findings, confirmed } = result1.value.result as { + findings: unknown[]; + confirmed: unknown[]; + }; + console.log(`\nTotal findings: ${findings.length}`); + console.log(`Confirmed vulnerabilities: ${confirmed.length}`); + console.log('runId:', result1.value.runId); +} +``` + +## Version 2: typed workflow + +The identical logic using `defineWorkflow` — fully type-checked, no eval. + +```ts +type Finding = { + file: string; + issues: string[]; + severity: 'low' | 'medium' | 'high' | 'critical'; +}; + +type Verification = { + confirmed: boolean; + reason: string; + severity: 'low' | 'medium' | 'high' | 'critical'; +}; + +const FINDING_SCHEMA = { + type: 'object', + properties: { + file: { type: 'string' }, + issues: { type: 'array', items: { type: 'string' } }, + severity: { type: 'string', enum: ['low', 'medium', 'high', 'critical'] }, + }, + required: ['file', 'issues', 'severity'], +} as const; + +const VERIFY_SCHEMA = { + type: 'object', + properties: { + confirmed: { type: 'boolean' }, + reason: { type: 'string' }, + severity: { type: 'string', enum: ['low', 'medium', 'high', 'critical'] }, + }, + required: ['confirmed', 'reason', 'severity'], +} as const; + +const reviewWorkflow = defineWorkflow({ + meta: { + name: 'security-review', + description: 'Scan files for issues, then verify each finding independently', + phases: [ + { title: 'Scan', detail: 'Parallel scan per file' 
}, + { title: 'Verify', detail: 'Independent verification per finding' }, + ], + }, + + run: async (wf) => { + const files = wf.args as string[]; + + // Phase 1: scan all files in parallel — barrier because we want all findings before verifying + wf.phase('Scan'); + const rawFindings = await wf.parallel( + files.map((f) => () => + wf.agent(`Scan ${f} for security vulnerabilities`, { + label: `scan:${f}`, + agentType: 'code-reviewer', + schema: FINDING_SCHEMA, + }), + ), + ); + + const findings = rawFindings.filter((f): f is Finding => f !== null); + wf.log(`Found ${findings.length} scan results`); + + if (findings.length === 0) { + return { findings: [], verified: [], confirmed: [] }; + } + + // Phase 2: verify each finding — pipeline because each item is independent + wf.phase('Verify'); + const verified = await wf.pipeline( + findings, + (finding) => + wf.agent( + `You are an independent security reviewer. Verify this finding — is it a real ` + + `vulnerability, or a false positive? Be skeptical. 
Finding: ${JSON.stringify(finding)}`, + { + label: `verify:${(finding as Finding).file}`, + agentType: 'code-reviewer', + schema: VERIFY_SCHEMA, + }, + ), + ); + + const confirmed = (verified as Array).filter( + (v): v is Verification => v?.confirmed === true, + ); + wf.log(`Confirmed ${confirmed.length} of ${findings.length} findings`); + + return { findings, verified, confirmed }; + }, +}); + +const result2 = await runWorkflow(reviewWorkflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + args: ['src/auth.ts', 'src/api.ts', 'src/db.ts'], + journal, + runId: 'review-002', + onEvent, +}); + +if (result2.ok) { + const output = result2.value.result as { + findings: Finding[]; + confirmed: Verification[]; + }; + console.log(`\nTotal findings: ${output.findings.length}`); + console.log(`Confirmed vulnerabilities: ${output.confirmed.length}`); + + for (const v of output.confirmed) { + console.log(` [${v.severity}] ${v.reason}`); + } +} +``` + +## Resume after a crash + +If the run crashes halfway (e.g. network error during the Verify phase), you can resume without re-running the Scan phase: + +```ts +const resumed = await runWorkflow(reviewWorkflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + args: ['src/auth.ts', 'src/api.ts', 'src/db.ts'], + journal, + runId: 'review-003', + resumeFromRunId: 'review-002', // replay unchanged prefix from this run + onEvent, +}); +``` + +The Scan agents whose calls are journaled will be replayed instantly. The first Verify agent that didn't complete will re-run live, and all subsequent agents will run live too. 
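The replay behaviour described above can be sketched in a few lines. This is a self-contained simulation rather than the runtime's code: `Entry`, `makeReplayingAgent`, and the use of the prompt string as a stand-in hash are hypothetical choices made for illustration.

```ts
// Hypothetical journal entry shape for this sketch: one record per agent() call.
type Entry = { index: number; hash: string; result: string };

// Wraps a "live" agent function so that calls matching the journal are replayed.
// Once one call misses (different hash or no entry), every later call runs live.
function makeReplayingAgent(
  journal: Entry[],
  live: (prompt: string) => string,
) {
  let index = 0;
  let diverged = false;
  const liveCalls: string[] = [];

  const agent = (prompt: string): string => {
    const i = index++;
    const hash = prompt; // assumption: the prompt itself stands in for a real hash
    const hit = diverged
      ? undefined
      : journal.find((e) => e.index === i && e.hash === hash);
    if (hit) return hit.result; // replayed: no model call is made
    diverged = true;
    liveCalls.push(prompt);
    return live(prompt);
  };

  return { agent, liveCalls };
}

// A previous run journaled two scans before crashing during verification.
const previousRun: Entry[] = [
  { index: 0, hash: 'scan a.ts', result: 'finding A' },
  { index: 1, hash: 'scan b.ts', result: 'finding B' },
];

const { agent, liveCalls } = makeReplayingAgent(previousRun, (p) => `live:${p}`);
const replayed = [agent('scan a.ts'), agent('scan b.ts'), agent('verify finding A')];
console.log(replayed, liveCalls);
```

In this sketch the two scan calls come back from the journal and only the verify call reaches the `live` function, which mirrors the Scan/Verify split in the example.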
+ +## Expected output + +``` +=== Scan === + → scan:src/auth.ts [code-reviewer] (claude-opus-4-7) + → scan:src/api.ts [code-reviewer] (claude-opus-4-7) + → scan:src/db.ts [code-reviewer] (claude-opus-4-7) + ✓ scan:src/auth.ts (1240 tokens) + ✓ scan:src/api.ts (980 tokens) + ✓ scan:src/db.ts (1105 tokens) + +=== Verify === + → verify:src/auth.ts [code-reviewer] (claude-opus-4-7) + ✓ verify:src/auth.ts (850 tokens) + → verify:src/api.ts [code-reviewer] (claude-opus-4-7) + ✓ verify:src/api.ts (720 tokens) + → verify:src/db.ts [code-reviewer] (claude-opus-4-7) + ✓ verify:src/db.ts (910 tokens) + +[workflow complete] + +Total findings: 3 +Confirmed vulnerabilities: 2 + [high] SQL query in db.ts line 42 uses string concatenation — SQL injection risk + [medium] auth.ts token expiry not validated on refresh path +``` + +## See also + +- [Workflow Runtime](/landlord/workflow) — `RuntimeConfig`, `WorkflowEvent` +- [Hooks reference](/landlord/hooks) — `parallel` vs `pipeline`, `schema` for structured output +- [Resume and journaling](/landlord/resume) — how the journal replay works +- [Agent Types](/landlord/agent-types) — `code-reviewer` and other built-in presets +- [Multi-Agent with Landlord](/examples/multi-agent) — the `orchestrate()` equivalent diff --git a/docs/landlord/agent-types.md b/docs/landlord/agent-types.md new file mode 100644 index 0000000..fac12b5 --- /dev/null +++ b/docs/landlord/agent-types.md @@ -0,0 +1,169 @@ +# Agent Types + +Every `agent()` call resolves an **agent type** — a preset that supplies a system prompt, a default tool set, and an optional model override. The `agentType` field on `AgentOpts` selects the preset; `'default'` is used when unset. 
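As a rough sketch of two rules this page relies on (custom presets are merged over the built-ins, and the model is chosen as `opts.model > preset.model > config.models.default`), the lookup might be modelled as below. `Preset`, `mergePresets`, and `pickModel` are hypothetical helpers with simplified types, not exports of `@flint/landlord`.

```ts
// Simplified stand-in for the documented AgentType (tools omitted for brevity).
type Preset = { systemPrompt: string; model?: string };

const builtIns: Record<string, Preset> = {
  default: { systemPrompt: 'General worker' },
  'code-reviewer': { systemPrompt: 'Code review' },
};

// Custom presets are merged over the built-ins, so a custom entry with the
// same name replaces the built-in one.
function mergePresets(custom: Record<string, Preset>): Record<string, Preset> {
  return { ...builtIns, ...custom };
}

// Precedence: opts.model, then the preset's model, then the configured default.
function pickModel(
  opts: { agentType?: string; model?: string },
  presets: Record<string, Preset>,
  defaultModel: string,
): string {
  const preset = presets[opts.agentType ?? 'default'];
  return opts.model ?? preset?.model ?? defaultModel;
}

const presets = mergePresets({
  'security-auditor': { systemPrompt: 'Audit for OWASP issues', model: 'claude-opus-4-7' },
});

const viaPreset = pickModel({ agentType: 'security-auditor' }, presets, 'claude-haiku-4-5');
const viaOpts = pickModel(
  { agentType: 'security-auditor', model: 'claude-haiku-4-5' },
  presets,
  'claude-opus-4-7',
);
const viaDefault = pickModel({}, presets, 'claude-haiku-4-5');
console.log(viaPreset, viaOpts, viaDefault);
```

Here `viaPreset` takes the preset's model, `viaOpts` is overridden by the per-call option, and `viaDefault` falls through to the configured default because the `'default'` preset sets no model.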
+ +## Type definitions + +```ts +type AgentType = { + systemPrompt: string; + tools?: (workDir: string) => Tool[]; + model?: string; +}; + +type AgentTypeRegistry = { + resolve(name: string): AgentType; + has(name: string): boolean; +}; +``` + +## Built-in types + +| Name | Tools | System prompt focus | +|------|-------|---------------------| +| `'default'` | `standardTools(workDir)` (file read/write, bash, web fetch) | General worker; returns structured results via `structured_output` tool when a schema is requested | +| `'Explore'` | `fileReadTool(workDir)`, `webFetchTool(workDir)` | Read-only exploration; searches broadly, returns conclusions, never modifies files | +| `'code-reviewer'` | `fileReadTool(workDir)`, `bashTool(workDir)` | Code review; reports concrete issues with file and line references | + +All three are available from `BUILT_IN_AGENT_TYPES` if you need to inspect them: + +```ts +import { BUILT_IN_AGENT_TYPES } from '@flint/landlord'; + +console.log(BUILT_IN_AGENT_TYPES.Explore.systemPrompt); +``` + +## Using built-in types + +```ts +import { defineWorkflow, runWorkflow } from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! 
}); + +const workflow = defineWorkflow({ + meta: { name: 'review', description: 'Explore then review' }, + run: async (wf) => { + wf.phase('Explore'); + // Read-only agent — no write tools + const overview = await wf.agent('Map out the codebase structure', { + agentType: 'Explore', + }); + + wf.phase('Review'); + // Code reviewer — file read + bash tools, review-focused system prompt + const review = await wf.agent( + `Review the code described here: ${String(overview)}`, + { agentType: 'code-reviewer', label: 'code-review' }, + ); + + return { overview, review }; + }, +}); + +const result = await runWorkflow(workflow, { + adapter, + models: { default: 'claude-opus-4-7' }, +}); +``` + +## Composing `agentType` with `schema` + +When `schema` is set alongside `agentType`, the preset's system prompt is used and the structured-output instruction is appended to it. The preset's tools are used as the base tool set, and the forced `structured_output` tool is added on top. + +```ts +const findings = await wf.agent('Find security issues in src/auth.ts', { + agentType: 'code-reviewer', + schema: { + type: 'object', + properties: { + issues: { + type: 'array', + items: { + type: 'object', + properties: { + file: { type: 'string' }, + line: { type: 'number' }, + description: { type: 'string' }, + }, + required: ['file', 'description'], + }, + }, + }, + required: ['issues'], + }, +}); +// findings is { issues: Array<{ file: string; line?: number; description: string }> } +// The code-reviewer system prompt + structured_output instruction were used +``` + +## Custom agent types + +Pass a `Record` to `createAgentRegistry` to add or override types. Custom types are merged over the built-ins. 
+ +```ts +import { + createAgentRegistry, + defineWorkflow, + runWorkflow, +} from '@flint/landlord'; +import { bashTool, fileReadTool } from '@flint/landlord/tools'; + +const registry = createAgentRegistry({ + 'security-auditor': { + systemPrompt: + 'You are a security auditor specializing in OWASP Top 10. ' + + 'Read code carefully. Report only confirmed vulnerabilities — no false positives. ' + + 'When returning structured output, always include severity and cve reference if known.', + tools: (workDir) => [fileReadTool(workDir), bashTool(workDir)], + model: 'claude-opus-4-7', // this type always uses Opus regardless of models.default + }, + 'doc-writer': { + systemPrompt: + 'You are a technical writer. Write clear, concise documentation. ' + + 'Use plain language and include runnable code examples.', + // No tools override — falls back to standardTools(workDir) + }, +}); + +const workflow = defineWorkflow({ + meta: { name: 'security-review', description: 'Security-focused review' }, + run: async (wf) => { + return wf.agent('Audit src/ for OWASP vulnerabilities', { + agentType: 'security-auditor', + }); + }, +}); + +await runWorkflow(workflow, { + adapter, + models: { default: 'claude-haiku-4-5' }, + registry, // security-auditor will still use claude-opus-4-7 due to its preset model +}); +``` + +## Per-agent model override + +`opts.model` takes priority over the preset's model which takes priority over `models.default`: + +``` +opts.model > preset.model > config.models.default +``` + +Use named tiers in `models` to manage fast/slow variants: + +```ts +await runWorkflow(workflow, { + adapter, + models: { default: 'claude-opus-4-7', fast: 'claude-haiku-4-5' }, +}); + +// Inside the workflow: +const quick = await wf.agent('Quick check', { model: 'claude-haiku-4-5' }); +``` + +## See also + +- [Hooks reference](/landlord/hooks) — `agent()` and `AgentOpts` full reference +- [Isolation](/landlord/isolation) — per-agent work directory backends +- [Workflow 
Runtime](/landlord/workflow) — `RuntimeConfig.registry` field diff --git a/docs/landlord/hooks.md b/docs/landlord/hooks.md new file mode 100644 index 0000000..178c5b1 --- /dev/null +++ b/docs/landlord/hooks.md @@ -0,0 +1,232 @@ +# Hooks Reference + +Every workflow receives a `WorkflowContext` — either as the `wf` parameter of `defineWorkflow`'s `run` function, or as injected globals in a string script. This page documents each hook exactly as it appears in the source. + +## Type definitions + +```ts +type WorkflowContext = { + agent(prompt: string, opts?: AgentOpts): Promise; + parallel(thunks: Array<() => Promise>): Promise>; + pipeline(items: unknown[], ...stages: StageFn[]): Promise; + phase(title: string): void; + log(message: string): void; + args: unknown; + budget: WorkflowBudgetView; + workflow( + ref: string | { scriptPath?: string; source?: string }, + args?: unknown, + ): Promise; +}; + +type AgentOpts = { + label?: string; + phase?: string; + schema?: Record; + model?: string; + isolation?: 'worktree'; + agentType?: string; +}; + +type StageFn = ( + prev: unknown, + originalItem: unknown, + index: number, +) => unknown | Promise; + +type WorkflowBudgetView = { + total: number | null; + spent: () => number; + remaining: () => number; +}; +``` + +--- + +## `agent(prompt, opts?)` + +Spawns a subagent — a full Flint `agent()` loop with an isolated work directory, tools from the resolved `agentType`, and optional structured output. 
+ +```ts +// No schema: returns the agent's final assistant text (string) +const summary = await wf.agent('Summarize the codebase in three sentences'); + +// With schema: agent is forced to call structured_output; returns validated object +const findings = await wf.agent('Review src/auth.ts for security issues', { + schema: { + type: 'object', + properties: { + issues: { type: 'array', items: { type: 'string' } }, + severity: { type: 'string', enum: ['low', 'medium', 'high'] }, + }, + required: ['issues', 'severity'], + }, +}); +// findings is { issues: string[], severity: string } +``` + +### AgentOpts + +| Field | Type | Description | +|-------|------|-------------| +| `label` | `string` | Display label for this agent in events and logs. Defaults to the first 48 characters of the prompt. | +| `phase` | `string` | Override the progress phase for this agent (avoids races in `parallel`/`pipeline`; the active `wf.phase()` is used otherwise). | +| `schema` | `Record` | JSON Schema for structured output. The agent is forced to call a `structured_output` tool and the validated result is returned. Retries on schema mismatch. | +| `model` | `string` | Per-agent model override. Wins over the `agentType` preset model and `config.models.default`. | +| `isolation` | `'worktree'` | Select the git-worktree isolation backend for this agent (requires `worktreeRepoDir` in `RuntimeConfig`; falls back to `workdirIsolation` outside a git repo). | +| `agentType` | `string` | Resolve a preset from the `AgentTypeRegistry`. Built-ins: `'default'`, `'Explore'`, `'code-reviewer'`. Composes with `schema`. | + +### Structured output retry behavior + +When `schema` is set, the runtime appends a forced `structured_output` tool to the agent's tool list. The handler validates the call with ajv. On mismatch, it returns an error message telling the agent which fields are wrong so the agent retries. Only the first valid call is captured; subsequent calls are ignored. 
A hard `WorkflowError` is thrown if the agent finishes without calling `structured_output` at all. + +--- + +## `parallel(thunks)` + +Runs all thunks concurrently and **waits for all of them** (barrier). A thunk that throws resolves to `null` — `parallel` never rejects. + +```ts +const files = ['src/a.ts', 'src/b.ts', 'src/c.ts']; + +// All three start simultaneously; parallel waits for all three +const results = await wf.parallel( + files.map((f) => () => wf.agent(`Summarize ${f}`)), +); +// results: Array — null where an agent threw + +const successes = results.filter((r): r is string => r !== null); +``` + +Use `parallel` when you need a barrier — stage N genuinely needs all of stage N-1's results before proceeding (for dedup, merging, or early-exit on zero results). For independent multi-stage work use `pipeline` instead. + +--- + +## `pipeline(items, ...stages)` + +Processes each item through every stage independently with **no barrier between stages**. Wall-clock time equals the slowest single-item chain, not the sum of all stages. + +Each stage receives `(prevResult, originalItem, index)`. A throwing stage sets that item's result to `null` and skips its remaining stages. The returned array is aligned to `items`. + +```ts +const files = ['src/a.ts', 'src/b.ts', 'src/c.ts']; + +// Stage 1: summarize; Stage 2: grade the summary — each file flows independently +const graded = await wf.pipeline( + files, + // stage 1: returns summary string + (_, file) => wf.agent(`Summarize ${file as string}`), + // stage 2: receives summary from stage 1, grades it + (summary, file) => + wf.agent(`Grade this summary of ${file as string}: "${summary as string}"`, { + schema: { + type: 'object', + properties: { score: { type: 'number' }, comment: { type: 'string' } }, + required: ['score', 'comment'], + }, + }), +); +// graded: Array<{ score: number; comment: string } | null> +``` + +`pipeline` is the default choice for multi-stage work. 
Only reach for `parallel` when you need a cross-item barrier. + +--- + +## `phase(title)` + +Marks the start of a named progress group. Fires a `phase_started` event and updates the ambient phase for subsequent `agent()` calls. + +```ts +wf.phase('Research'); +// ... agents here get phase: 'Research' in their events + +wf.phase('Write'); +// ... agents here get phase: 'Write' +``` + +--- + +## `log(message)` + +Emits a `log` event — a human-readable narrator line for progress reporting. + +```ts +wf.log(`Processing ${files.length} files`); +wf.log('All findings verified'); +``` + +--- + +## `args` + +The value passed as `RuntimeConfig.args`. Available verbatim in both string scripts and typed workflows. No type assumption — cast to the expected shape: + +```ts +// Typed workflow +const files = wf.args as string[]; + +// String script +const files = args; // injected as a global +``` + +--- + +## `budget` + +Exposes the run's token usage against the optional `tokenTarget` ceiling: + +```ts +type WorkflowBudgetView = { + total: number | null; // tokenTarget, or null when unset + spent: () => number; // output tokens used so far this run + remaining: () => number; // max(0, total - spent()) or Infinity when total is null +}; +``` + +Use `budget` for adaptive loops: + +```ts +while (wf.budget.total !== null && wf.budget.remaining() > 50_000) { + const round = await wf.agent('Find more issues'); + if (!round) break; + // process round... +} +``` + +Or to size a fleet proportionally: + +```ts +const agentCount = wf.budget.total !== null + ? Math.floor(wf.budget.total / 100_000) + : 5; +``` + +--- + +## `workflow(ref, args?)` + +Runs another workflow inline, sharing the current run's concurrency cap, agent counter, budget, signal, and journal. One nesting level only — calling `workflow()` inside a child workflow throws `WorkflowError('workflow() nesting is one level only')`. 
+ +```ts +// Named workflow from the registry (pass createWorkflowRegistry to RuntimeConfig.workflows) +const subResult = await wf.workflow('analyze', { path: 'src/' }); + +// Inline source string (args is the second parameter, per the signature above) +const subResult2 = await wf.workflow( + { + source: ` + export const meta = { name: 'quick-check', description: 'Quick check' } + return await agent('Quick check: ' + args) + `, + }, + 'src/index.ts', +); +``` + +--- + +## See also + +- [Workflow Runtime](/landlord/workflow) — `RuntimeConfig`, `WorkflowEvent`, quick start +- [Resume and journaling](/landlord/resume) — replaying unchanged `agent()` calls +- [Agent Types](/landlord/agent-types) — `agentType` presets and custom types +- [Isolation](/landlord/isolation) — work directory isolation backends diff --git a/docs/landlord/index.md b/docs/landlord/index.md index 1dfeaa8..1c65395 100644 --- a/docs/landlord/index.md +++ b/docs/landlord/index.md @@ -73,6 +73,49 @@ if (result.ok) { | **Eviction** | When a tenant fails a checkpoint or runs out of budget — triggers retry | | **Escalation** | When a tenant exhausts all retries — its dependents are cancelled | +## Two ways to orchestrate + +### Script-driven workflow runtime (new) + +Write TypeScript that drives subagents directly using hooks (`agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow`). You control the control flow: loops, conditionals, and fan-out are all plain code.
+ +```ts +import { defineWorkflow, runWorkflow } from '@flint/landlord'; + +const workflow = defineWorkflow({ + meta: { name: 'review', description: 'Review and verify findings' }, + run: async (wf) => { + wf.phase('Scan'); + const findings = await wf.parallel( + (wf.args as string[]).map((f) => () => + wf.agent(`Scan ${f} for issues`, { agentType: 'code-reviewer', schema: FINDINGS_SCHEMA }), + ), + ); + return findings.filter(Boolean); + }, +}); + +const result = await runWorkflow(workflow, { adapter, models: { default: 'claude-opus-4-7' } }); +``` + +The model can also write the workflow as a string script via `runWorkflowScript` or the `workflowTool`. See [Workflow Runtime](/landlord/workflow) for the full API. + +### Auto-decompose (`orchestrate()`) + +Describe the goal in a prompt; the orchestrator asks an LLM to decompose it into a `Contract[]` (a DAG of worker specs) and then runs all tenants in parallel where dependencies allow. Useful when the decomposition strategy itself should be model-driven. + +```ts +import { orchestrate } from '@flint/landlord'; + +const result = await orchestrate( + 'Build a REST API for a todo app with CRUD endpoints and SQLite storage', + (workDir) => standardTools(workDir), + { adapter, landlordModel: 'claude-opus-4-7', tenantModel: 'claude-opus-4-7' } +); +``` + +`orchestrate()` is now built on the workflow runtime internally — it shares the same concurrency cap, journaling, and event system. Its public API and behavior are unchanged. + ## When to use landlord vs agent() Use `agent()` when a single model can accomplish the goal in one continuous loop. 
Use landlord when: @@ -84,6 +127,8 @@ Use `agent()` when a single model can accomplish the goal in one continuous loop ## See also +- [Workflow Runtime](/landlord/workflow) — script-driven multi-agent orchestration +- [Hooks reference](/landlord/hooks) — `agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow` - [Contracts](/landlord/contract) — Contract and Checkpoint schemas - [decompose()](/landlord/decompose) — how goals become contract lists - [orchestrate()](/landlord/orchestrate) — full orchestration API diff --git a/docs/landlord/isolation.md b/docs/landlord/isolation.md new file mode 100644 index 0000000..7aa8637 --- /dev/null +++ b/docs/landlord/isolation.md @@ -0,0 +1,120 @@ +# Isolation + +Every `agent()` call in a workflow gets an isolated work directory. This keeps agents from accidentally reading each other's partial outputs or writing to shared paths. The `IsolationBackend` interface controls how that directory is provisioned and cleaned up. + +## Type definitions + +```ts +type IsolationLease = { + workDir: string; + release: () => Promise; +}; + +interface IsolationBackend { + acquire(label: string): Promise; +} +``` + +`acquire` receives the agent's label (sanitized to alphanumeric + `_-`, max 40 characters) and returns a lease containing the path to use. `release` is called after the agent loop finishes — whether it succeeded or threw. + +## `workdirIsolation(baseDir)` — default + +Creates a fresh subdirectory under `baseDir` for each agent. Directory names are `-`. `release` is a no-op (the directory is kept for inspection after the run). + +This is the default backend used by all agents unless overridden. 
+ +```ts +import { runWorkflow, workdirIsolation } from '@flint/landlord'; +import { join } from 'node:path'; + +// Explicit — use a specific base directory +const result = await runWorkflow(workflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + isolation: workdirIsolation(join(process.cwd(), 'agent-workdirs')), +}); +``` + +If you don't pass `isolation`, the runtime creates a `workdirIsolation` in `os.tmpdir()/flint-workflow-/` automatically. + +## `gitWorktreeIsolation(repoDir, baseDir)` — optional + +Creates a git worktree per agent via `git worktree add --detach`. Each agent gets a clean checkout of `HEAD` to work in. `release` runs `git worktree remove --force` after the agent finishes. + +**Requires** a git repository at `repoDir`. Outside a git repo, `gitWorktreeIsolation` falls back silently to `workdirIsolation(baseDir)`. + +Enable it for all agents by setting `worktreeRepoDir` in `RuntimeConfig`, or per-agent with `isolation: 'worktree'` in `AgentOpts`: + +```ts +import { defineWorkflow, runWorkflow } from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! 
}); + +const workflow = defineWorkflow({ + meta: { name: 'parallel-edits', description: 'Edit files in parallel worktrees' }, + run: async (wf) => { + // These two agents each get their own git worktree + const [resultA, resultB] = await wf.parallel([ + () => wf.agent('Refactor src/auth.ts to use async/await throughout', { + isolation: 'worktree', + label: 'refactor-auth', + }), + () => wf.agent('Add JSDoc comments to src/api.ts', { + isolation: 'worktree', + label: 'jsdoc-api', + }), + ]); + return { resultA, resultB }; + }, +}); + +const result = await runWorkflow(workflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + // Enable the worktree backend for agents that use isolation: 'worktree' + worktreeRepoDir: process.cwd(), +}); +``` + +## When to use `isolation: 'worktree'` + +Use a git worktree when: + +- Agents will modify files and you want each agent to start from a clean copy of `HEAD`. +- You need to diff or merge the agent's changes after it finishes. +- Agents running in parallel would otherwise conflict on the same files. + +Use the default `workdirIsolation` when: + +- Agents are read-only (search, analysis, code review). +- You want agents to read from the real working tree without copying it. +- You are not in a git repository. 
+ +## Bringing a custom backend + +Any object with an `acquire(label)` method that returns `{ workDir, release }` works: + +```ts +import type { IsolationBackend } from '@flint/landlord'; + +// Example: always use /tmp/shared-workdir (single shared dir — not recommended for parallel agents) +const sharedDirBackend: IsolationBackend = { + acquire: async () => ({ + workDir: '/tmp/shared-workdir', + release: async () => {}, + }), +}; + +await runWorkflow(workflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + isolation: sharedDirBackend, +}); +``` + +## See also + +- [Workflow Runtime](/landlord/workflow) — `RuntimeConfig.isolation`, `RuntimeConfig.worktreeRepoDir` +- [Hooks reference](/landlord/hooks) — `AgentOpts.isolation` +- [Agent Types](/landlord/agent-types) — tool sets that use `workDir` diff --git a/docs/landlord/orchestrate.md b/docs/landlord/orchestrate.md index 2e94a68..421ff63 100644 --- a/docs/landlord/orchestrate.md +++ b/docs/landlord/orchestrate.md @@ -2,6 +2,10 @@ `orchestrate()` runs the complete landlord pipeline: decompose a goal into contracts, sort by dependency, run all tenants in parallel (where dependencies allow), collect artifacts, and return the result. +::: info Built on the workflow runtime +`orchestrate()` is now implemented as a built-in workflow on the [workflow runtime](/landlord/workflow). The public API, event names, and result shape are **unchanged** — existing code continues to work without modification. The rebuild means `orchestrate()` jobs now share the same concurrency cap, run ID, and `WorkflowEvent` infrastructure as script-driven workflows. +::: + ## Signature ```ts diff --git a/docs/landlord/resume.md b/docs/landlord/resume.md new file mode 100644 index 0000000..bce1490 --- /dev/null +++ b/docs/landlord/resume.md @@ -0,0 +1,163 @@ +# Resume and Journaling + +The workflow runtime journals every `agent()` call. 
If a run crashes or times out, you can restart it with `resumeFromRunId` and it will replay the unchanged prefix from the journal — skipping all the model calls that already succeeded. + +## How journaling works + +Every `agent()` call: + +1. Computes an index (monotonically incremented per run) and a hash of `{ prompt, opts }`. +2. Checks the loaded resume entries for an entry with the same `index` and `hash`. +3. If found — returns the cached result immediately (no model call, no slot acquired). +4. If not found — runs live, then appends `{ index, hash, result }` to the journal. + +The first divergence (different prompt/opts or new index) runs live. Everything after it runs live too. Same script + same args guarantees a 100% cache hit on resume. + +## JournalStore interface + +```ts +interface JournalStore { + append(runId: string, entry: JournalEntry): Promise; + load(runId: string): Promise; +} + +type JournalEntry = { + index: number; // monotonic call counter + hash: string; // FNV-1a of stableStringify({ prompt, opts }) + result: unknown; // the captured return value +}; +``` + +## `memoryJournalStore()` + +The default. Stores entries in memory — not persistent across process restarts. Useful for testing and short-lived runs. + +```ts +import { memoryJournalStore, runWorkflowScript } from '@flint/landlord'; + +const journal = memoryJournalStore(); +const source = ` +export const meta = { name: 'review', description: 'Review files' } +const result = await agent('Review src/auth.ts for issues') +return result +`; + +const r1 = await runWorkflowScript(source, { + adapter, + models: { default: 'claude-opus-4-7' }, + journal, + runId: 'run-001', +}); +// r1.ok === true, r1.value.result === 'some review text' + +// Simulate a resume in the same process (e.g. 
after a downstream failure): +const r2 = await runWorkflowScript(source, { + adapter: throwingAdapter, // never called — replay hits cache + models: { default: 'claude-opus-4-7' }, + journal, + runId: 'run-002', + resumeFromRunId: 'run-001', +}); +// r2.ok === true, r2.value.result === 'some review text' (replayed) +``` + +## `fileJournalStore(dir)` + +Persists entries as JSONL files on disk — survives process restarts. Each run gets its own file: `journal-.jsonl`. + +```ts +import { fileJournalStore, runWorkflowScript } from '@flint/landlord'; +import { join } from 'node:path'; + +const journal = fileJournalStore(join(process.cwd(), '.workflow-journal')); + +// First run: calls the model and writes to disk +const r1 = await runWorkflowScript(source, { + adapter, + models: { default: 'claude-opus-4-7' }, + journal, + runId: 'run-001', +}); + +// Later (even in a fresh process): resume from disk +const r2 = await runWorkflowScript(source, { + adapter, + models: { default: 'claude-opus-4-7' }, + journal, + runId: 'run-002', + resumeFromRunId: 'run-001', +}); +``` + +## Full resume example + +A realistic pattern: run once, crash halfway, resume without repeating the expensive calls. + +```ts +import { + defineWorkflow, + fileJournalStore, + runWorkflow, +} from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; +import { join } from 'node:path'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! 
}); +const journal = fileJournalStore(join(process.cwd(), '.workflow-journal')); + +const auditWorkflow = defineWorkflow({ + meta: { name: 'audit', description: 'Audit a codebase' }, + run: async (wf) => { + wf.phase('Scan'); + const scan = await wf.agent('Scan the codebase for TODOs and FIXMEs'); + + wf.phase('Analyze'); + // If this agent fails and the run crashes, the 'Scan' agent above is journaled + const analysis = await wf.agent('Analyze these items and prioritize: ' + String(scan)); + + wf.phase('Report'); + const report = await wf.agent('Write a summary report of: ' + String(analysis)); + + return { scan, analysis, report }; + }, +}); + +async function runWithResume(runId: string, resumeFromRunId?: string) { + return runWorkflow(auditWorkflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + journal, + runId, + resumeFromRunId, + }); +} + +// First attempt +const r1 = await runWithResume('audit-2026-05-31-a'); +if (!r1.ok) { + console.error('Run failed:', r1.error.message); + // Resume from the partial run — 'Scan' will be replayed, 'Analyze' re-runs + const r2 = await runWithResume('audit-2026-05-31-b', 'audit-2026-05-31-a'); + if (r2.ok) console.log('Resumed result:', r2.value.result); +} +``` + +## Determinism requirement + +Resume works by replaying a hash match: `hash(prompt, opts)` at the same call index. This requires the workflow to produce the same call sequence on restart. + +**String scripts:** The sandbox blocks `Date.now()`, `new Date()`, and `Math.random()` (they throw). This enforces determinism automatically. Pass timestamps or seeds via `args`. + +**Typed workflows:** The sandbox cannot intercept lexical globals. You are responsible for avoiding nondeterminism in the `run` function: + +- Do not call `Date.now()` or `Math.random()` inside `run()`. +- Do not vary calls based on external state that could change between runs. +- Pass any variable inputs through `wf.args`. 
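To make the hash-match idea concrete, here is a small sketch of a key-order-independent hash over `{ prompt, opts }`. The journal section names FNV-1a over a stable stringify; the 32-bit variant, the hex output, hashing UTF-16 code units instead of bytes, and the `stableStringify` and `fnv1a` implementations below are all assumptions for illustration and may differ from what the runtime does.

```ts
// Serialize a value with object keys sorted, so key order never affects the output.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value) ?? 'null';
}

// 32-bit FNV-1a, returned as 8 hex digits.
// Simplification: this folds in UTF-16 code units rather than UTF-8 bytes.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, mod 2^32
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

const opts = { label: 'scan', agentType: 'code-reviewer' };
const sameKeysReordered = { agentType: 'code-reviewer', label: 'scan' };

const a = fnv1a(stableStringify({ prompt: 'Scan src/auth.ts', opts }));
const b = fnv1a(stableStringify({ opts: sameKeysReordered, prompt: 'Scan src/auth.ts' }));
// A prompt that embeds a changing value (such as a timestamp) is a different
// input, so it is expected to hash differently and miss the journal.
const c = fnv1a(stableStringify({ prompt: 'Scan src/auth.ts at 1717171717', opts }));

console.log(a === b, a === c);
```

The point of the sketch is that only the content of `prompt` and `opts` matters, not property order, which is why values that change between runs belong in `args` rather than being computed inside the workflow.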
+ +If the call sequence diverges from the journal, the divergence point and all subsequent calls run live — the partial cache is still useful; you just lose hits after the divergence. + +## See also + +- [Workflow Runtime](/landlord/workflow) — `RuntimeConfig` fields `runId`, `resumeFromRunId`, `journal` +- [Hooks reference](/landlord/hooks) — `agent()` call semantics +- [Dynamic Workflow Example](/examples/dynamic-workflow) — end-to-end example with journaling diff --git a/docs/landlord/workflow-tool.md b/docs/landlord/workflow-tool.md new file mode 100644 index 0000000..8a5027c --- /dev/null +++ b/docs/landlord/workflow-tool.md @@ -0,0 +1,194 @@ +# Workflow Tool + +`workflowTool` wraps the workflow runtime as a Flint `Tool`. Drop it into any `agent()` call and that agent can author workflow scripts and run them — giving the model the same orchestration capabilities described in the rest of this section. + +## `workflowTool(config)` + +```ts +import { workflowTool } from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }); + +const wfTool = workflowTool({ + adapter, + models: { default: 'claude-opus-4-7' }, +}); +``` + +### WorkflowToolConfig + +```ts +type WorkflowToolConfig = { + adapter: ProviderAdapter; + models: Models; + registry?: AgentTypeRegistry; // custom agent types + workflows?: WorkflowRegistry; // named registered workflows + journal?: JournalStore; // journal backend (default: memoryJournalStore) + isolation?: IsolationBackend; // isolation backend (default: workdirIsolation) + onEvent?: EventSink; // progress callback +}; +``` + +The tool is named `workflow` and accepts: + +| Input field | Type | Description | +|-------------|------|-------------| +| `script` | `string` | A workflow JS script starting with `export const meta = { ... 
}` | +| `args` | `unknown` | Value exposed to the script as `args` | +| `name` | `string` | Name of a registered workflow to run instead of `script` | +| `resumeFromRunId` | `string` | Resume a prior run, replaying unchanged agents | + +The handler returns a JSON string `{ runId: string, result: unknown }`. On error it returns a string starting with `Error:` so the calling agent can see the message and retry or escalate. + +## Using the tool in an agent + +```ts +import { agent } from 'flint'; +import { budget } from 'flint/budget'; +import { workflowTool, WORKFLOW_TOOL_GUIDE } from '@flint/landlord'; + +const wfTool = workflowTool({ adapter, models: { default: 'claude-opus-4-7' } }); + +const result = await agent({ + adapter, + model: 'claude-opus-4-7', + messages: [ + { role: 'system', content: WORKFLOW_TOOL_GUIDE }, + { role: 'user', content: 'Audit the src/ directory for security vulnerabilities. ' + + 'Use a multi-agent pipeline: first scan all files, then verify each finding independently.' }, + ], + tools: [wfTool], + budget: budget({ maxSteps: 20, maxDollars: 5.00 }), +}); + +if (result.ok) { + console.log('Agent response:', result.value.message.content); +} +``` + +## `WORKFLOW_TOOL_GUIDE` + +A system-prompt string that teaches the model how to author effective workflows. It covers: + +- The meta block syntax and all available hooks +- The barrier-vs-no-barrier distinction (`parallel` vs `pipeline`) +- When to use `schema` for structured output +- The concurrency and agent cap +- Quality patterns: adversarial verify, judge panel, loop-until-dry, multi-modal sweep, completeness critic + +Paste it into any system prompt where you want the model to reason about multi-agent orchestration. The `orchestratorAgent` helper does this automatically. 
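
To make the barrier-vs-no-barrier distinction concrete, here is a small timing model rather than a use of the runtime itself. It assumes every item can run at once (the concurrency cap is ignored), and `finishTimesNoBarrier` / `finishTimesWithBarrier` are hypothetical helper names introduced only for this sketch.

```typescript
// durations[i][s] = time units item i spends in stage s (illustrative numbers only).

// pipeline-style: each item moves to its next stage as soon as its own previous stage ends.
function finishTimesNoBarrier(durations: number[][]): number[] {
  return durations.map((stages) => stages.reduce((sum, d) => sum + d, 0));
}

// parallel-per-stage style: stage s+1 starts for everyone only after the slowest item ends stage s.
function finishTimesWithBarrier(durations: number[][]): number[] {
  const stageCount = durations[0]?.length ?? 0;
  const finish = durations.map(() => 0);
  let clock = 0; // time at which the current stage starts for all items
  for (let s = 0; s < stageCount; s++) {
    const stageStart = clock;
    durations.forEach((stages, i) => {
      finish[i] = stageStart + (stages[s] ?? 0);
    });
    clock = stageStart + Math.max(...durations.map((stages) => stages[s] ?? 0));
  }
  return finish;
}

// Two items, two stages: the second item is slow in stage 0.
const sample = [
  [1, 1],
  [5, 1],
];
const noBarrier = finishTimesNoBarrier(sample);
const withBarrier = finishTimesWithBarrier(sample);
```

With these numbers the fast item finishes at time 2 without a barrier but waits until time 6 with one, which is the trade-off to weigh when choosing between `pipeline` and back-to-back `parallel` calls.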
+ +## `orchestratorAgent(config)` + +A convenience wrapper that pre-wires `workflowTool` and `WORKFLOW_TOOL_GUIDE` into a callable agent function: + +```ts +import { orchestratorAgent } from '@flint/landlord'; +import { budget } from 'flint/budget'; + +const orchestrate = orchestratorAgent({ + adapter, + models: { default: 'claude-opus-4-7' }, +}); + +// Call it like a regular agent +const result = await orchestrate( + 'Build a comprehensive review of the authentication system. ' + + 'Cover: implementation quality, security posture, and test coverage.', + { budget: budget({ maxDollars: 10.00, maxSteps: 50 }) }, +); + +if (result.ok) { + console.log('Review:', result.value.message.content); +} +``` + +`orchestratorAgent` returns a function with the signature: + +```ts +(prompt: string, opts?: { budget?: Budget; model?: string }) => ReturnType +``` + +## Full example: tool + event logging + +```ts +import { agent } from 'flint'; +import { budget } from 'flint/budget'; +import { + fileJournalStore, + workflowTool, + WORKFLOW_TOOL_GUIDE, +} from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; +import { join } from 'node:path'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! 
}); +const journal = fileJournalStore(join(process.cwd(), '.workflow-journal')); + +const wfTool = workflowTool({ + adapter, + models: { default: 'claude-opus-4-7' }, + journal, + onEvent: (e) => { + if (e.type === 'phase_started') console.log(`\n[${e.title}]`); + if (e.type === 'agent_started') console.log(` → ${e.label} (${e.model})`); + if (e.type === 'agent_complete') console.log(` ✓ ${e.label} (${e.tokens} tokens)`); + if (e.type === 'workflow_complete') console.log('\n[Done]'); + }, +}); + +const result = await agent({ + adapter, + model: 'claude-opus-4-7', + messages: [ + { role: 'system', content: WORKFLOW_TOOL_GUIDE }, + { + role: 'user', + content: + 'Review src/ for security issues using parallel scanners, ' + + 'then verify each finding with an independent agent.', + }, + ], + tools: [wfTool], + budget: budget({ maxSteps: 30, maxDollars: 8.00 }), +}); + +if (!result.ok) { + console.error('Agent failed:', result.error.message); +} else { + console.log('\nAgent response:\n', result.value.message.content); +} +``` + +## Registered workflows + +Pass a `WorkflowRegistry` to `workflowTool` to let the model (or agent) run named workflows by name instead of writing a script every time: + +```ts +import { createWorkflowRegistry, workflowTool } from '@flint/landlord'; + +const source = ` +export const meta = { name: 'security-scan', description: 'Run a security scan' } +const targets = args ?? 
['src/'] +const results = await parallel(targets.map(t => () => agent('Scan ' + t + ' for vulnerabilities'))) +return results.filter(Boolean) +`; + +const workflows = createWorkflowRegistry({ 'security-scan': source }); + +const wfTool = workflowTool({ + adapter, + models: { default: 'claude-opus-4-7' }, + workflows, +}); + +// The model can now call the tool with { name: 'security-scan', args: ['src/', 'tests/'] } +``` + +## See also + +- [Workflow Runtime](/landlord/workflow) — `runWorkflow`, `runWorkflowScript`, `RuntimeConfig` +- [Hooks reference](/landlord/hooks) — the full hook API the model has access to +- [Resume and journaling](/landlord/resume) — `resumeFromRunId` for model-authored workflow recovery +- [Dynamic Workflow Example](/examples/dynamic-workflow) — complete end-to-end example diff --git a/docs/landlord/workflow.md b/docs/landlord/workflow.md new file mode 100644 index 0000000..8f6dfe9 --- /dev/null +++ b/docs/landlord/workflow.md @@ -0,0 +1,201 @@ +# Workflow Runtime + +The workflow runtime is the core of `@flint/landlord`. It lets you drive multiple subagents with plain TypeScript — writing real control flow (loops, conditionals, fan-out) rather than declaring a static DAG. + +## Mental model + +A workflow is a function (or a JS string) that receives a set of hooks and calls them to orchestrate work: + +``` +runWorkflowScript(source, config) + └── compileScript(source) ← parse meta, strip exports, wrap in AsyncFunction + └── executeModule(module, deps) + └── buildContext(deps) ← inject agent/parallel/pipeline/phase/log/args/budget/workflow + └── module.run(wf) + ├── wf.phase('Find') + ├── wf.parallel([...]) ← concurrent with barrier + ├── wf.pipeline(items, ...) ← no barrier between stages + └── wf.agent(prompt, opts) ← spawns one subagent +``` + +Every `agent()` call runs a full Flint `agent()` loop in an isolated work directory. Concurrency is capped automatically; a per-run agent counter prevents runaway workflows. 
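
As a rough sketch of the two limits mentioned above, the following assumes the defaults listed in the `RuntimeConfig` table (`max(1, min(16, cpus-2))` for concurrency and `1000` for the agent cap). The helper names and the error message are hypothetical; the runtime's own counter and error type may look different.

```typescript
// Default semaphore size, following the documented formula max(1, min(16, cpus - 2)).
function defaultConcurrency(cpuCount: number): number {
  return Math.max(1, Math.min(16, cpuCount - 2));
}

// Hypothetical lifetime counter: hands out agent numbers until the cap is reached.
function createAgentCounter(cap: number = 1000): () => number {
  let used = 0;
  return () => {
    if (used >= cap) {
      // Assumption: the real runtime reports this differently.
      throw new Error(`agent cap of ${cap} reached`);
    }
    used += 1;
    return used;
  };
}

const smallMachine = defaultConcurrency(2);
const laptop = defaultConcurrency(8);
const bigServer = defaultConcurrency(64);

// A cap of 2 allows two agents; a third request is refused.
const nextAgent = createAgentCounter(2);
const firstId = nextAgent();
const secondId = nextAgent();
let thirdRefused = false;
try {
  nextAgent();
} catch {
  thirdRefused = true;
}
```

The floor of 1 matters on small machines: with two or fewer CPUs, `cpus - 2` would otherwise leave no slot for any agent.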
+ +## Two authoring paths + +### String script (`runWorkflowScript`) + +The model writes a workflow as a plain JS string. The runtime parses a `meta` block, sandboxes nondeterminism, and injects hooks as globals: + +```ts +import { runWorkflowScript } from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }); + +const source = ` +export const meta = { name: 'review', description: 'Review and verify findings' } + +phase('Find') +const findings = await parallel(files.map(f => () => agent( + 'Review ' + f + ' for security issues', + { schema: { type: 'object', properties: { issues: { type: 'array', items: { type: 'string' } } }, required: ['issues'] } } +))) + +phase('Verify') +const verified = await pipeline( + findings.filter(Boolean), + (f) => agent('Verify this finding — is it a real vulnerability? ' + JSON.stringify(f), + { schema: { type: 'object', properties: { confirmed: { type: 'boolean' }, reason: { type: 'string' } }, required: ['confirmed', 'reason'] } }) +) + +return verified.filter(v => v?.confirmed) +`; + +const files = ['src/auth.ts', 'src/api.ts']; +const result = await runWorkflowScript(source, { + adapter, + models: { default: 'claude-opus-4-7' }, + args: files, + onEvent: (e) => { + if (e.type === 'phase_started') console.log(`\n=== ${e.title} ===`); + if (e.type === 'agent_complete') console.log(` done: ${e.label} (${e.tokens} tokens)`); + }, +}); + +if (result.ok) { + console.log('Confirmed vulnerabilities:', result.value.result); +} +``` + +Note that `args` is passed to the script as a global `args` value; here the files array is available in the script as `files` only because the example uses `args` directly — update the script to use `const files = args` or pass it via the `args` field. 
+ +### Typed workflow (`defineWorkflow`) + +For production code where you want type checking: + +```ts +import { defineWorkflow, runWorkflow } from '@flint/landlord'; +import { anthropicAdapter } from '@flint/adapter-anthropic'; + +const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }); + +const reviewWorkflow = defineWorkflow({ + meta: { + name: 'review', + description: 'Review and verify findings', + phases: [ + { title: 'Find', detail: 'Scan files for issues' }, + { title: 'Verify', detail: 'Confirm each finding is real' }, + ], + }, + run: async (wf) => { + const files = wf.args as string[]; + + wf.phase('Find'); + const findings = await wf.parallel( + files.map((f) => () => + wf.agent(`Review ${f} for security issues`, { + schema: { + type: 'object', + properties: { issues: { type: 'array', items: { type: 'string' } } }, + required: ['issues'], + }, + }), + ), + ); + + wf.phase('Verify'); + const verified = await wf.pipeline( + findings.filter(Boolean), + (finding) => + wf.agent(`Verify this finding — is it a real vulnerability? 
${JSON.stringify(finding)}`, { + schema: { + type: 'object', + properties: { + confirmed: { type: 'boolean' }, + reason: { type: 'string' }, + }, + required: ['confirmed', 'reason'], + }, + }), + ); + + return (verified as Array<{ confirmed: boolean } | null>).filter((v) => v?.confirmed); + }, +}); + +const result = await runWorkflow(reviewWorkflow, { + adapter, + models: { default: 'claude-opus-4-7' }, + args: ['src/auth.ts', 'src/api.ts'], +}); + +if (result.ok) { + console.log('runId:', result.value.runId); + console.log('Result:', result.value.result); +} +``` + +## RuntimeConfig + +Both `runWorkflowScript` and `runWorkflow` accept the same `RuntimeConfig`: + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `adapter` | `ProviderAdapter` | required | Flint provider adapter for all model calls | +| `models` | `{ default: string; [tier: string]: string }` | required | Model tier map. `models.default` is used unless overridden per agent | +| `args` | `unknown` | `undefined` | Value passed into the workflow as `wf.args` / the `args` global | +| `budget` | `Budget` | unlimited steps | Shared Flint `Budget` across all agent calls | +| `tokenTarget` | `number \| null` | `null` | Optional output-token ceiling. Agents fail with `WorkflowError` after this many output tokens | +| `registry` | `AgentTypeRegistry` | built-ins | Custom agent-type registry (merged over built-ins if using `createAgentRegistry`) | +| `workflows` | `WorkflowRegistry` | none | Named workflow registry for `workflow(name)` calls | +| `journal` | `JournalStore` | `memoryJournalStore()` | Journal backend for resume. 
Use `fileJournalStore(dir)` for persistence across processes | +| `isolation` | `IsolationBackend` | `workdirIsolation(baseDir)` | Default isolation backend for all agents | +| `worktreeRepoDir` | `string` | none | Enables `gitWorktreeIsolation` for agents that pass `isolation: 'worktree'` | +| `baseDir` | `string` | `os.tmpdir()/flint-workflow-` | Base directory for isolated work dirs | +| `concurrency` | `number` | `max(1, min(16, cpus-2))` | Semaphore limit — max agents running simultaneously | +| `agentCap` | `number` | `1000` | Lifetime agent counter ceiling per run | +| `onEvent` | `(e: WorkflowEvent) => void` | none | Progress callback for all workflow events | +| `signal` | `AbortSignal` | none | Cancels in-flight agents and skips queued ones | +| `runId` | `string` | random UUID slice | ID for the current run (used as journal key) | +| `resumeFromRunId` | `string` | none | Load the journal for this prior `runId` and replay the unchanged prefix | + +## WorkflowEvent catalog + +```ts +type WorkflowEvent = + | { type: 'phase_started'; title: string } + | { type: 'log'; message: string } + | { type: 'agent_started'; label: string; phase?: string; agentType: string; model: string } + | { type: 'agent_complete'; label: string; phase?: string; tokens: number } + | { type: 'agent_error'; label: string; phase?: string; error: string } + | { type: 'workflow_complete'; result: unknown }; +``` + +- `phase_started` — fired when `wf.phase(title)` is called +- `log` — fired when `wf.log(message)` is called +- `agent_started` — fired before each agent loop starts (includes model and agentType) +- `agent_complete` — fired when an agent loop finishes successfully (includes total tokens used) +- `agent_error` — fired when an agent loop throws; the error propagates unless wrapped in `parallel`/`pipeline` +- `workflow_complete` — fired after `run()` returns, carries the return value + +## Comparison: which orchestration primitive? 
+ +| | `runWorkflow` / `runWorkflowScript` | `orchestrate()` | `@flint/graph` | `agent()` | +|-|-------------------------------------|-----------------|----------------|-----------| +| **Authoring** | Code (imperative loops, fan-out) | Prompt (LLM decomposes) | State-machine nodes | Single loop | +| **Control flow** | You write it | Auto-generated DAG | Explicit transitions | Tool calls | +| **Structured output** | `schema` per agent | Checkpoints per tenant | Node output types | Tools | +| **Resume** | Yes — journaling | No | Yes — checkpoints | No | +| **Best for** | Scripted multi-phase pipelines | Open-ended goals | Stateful multi-turn | Single-agent tasks | + +`orchestrate()` itself is now a built-in workflow on this runtime. The APIs are independent; pick the one that matches how much control you want. + +## See also + +- [Hooks reference](/landlord/hooks) — full `agent`, `parallel`, `pipeline`, `phase`, `log`, `args`, `budget`, `workflow` API +- [Resume and journaling](/landlord/resume) — how to resume a crashed run +- [Agent Types](/landlord/agent-types) — built-in presets and custom agent types +- [Isolation](/landlord/isolation) — per-agent work directories and git-worktree backends +- [Workflow Tool](/landlord/workflow-tool) — give a model the ability to write and run workflows +- [Dynamic Workflow Example](/examples/dynamic-workflow) — a complete review pipeline, both string and typed From 63bb0fd7727ee60cee81a6c556b45fd74f544544 Mon Sep 17 00:00:00 2001 From: DizzyMii Date: Sun, 31 May 2026 17:30:08 -0600 Subject: [PATCH 22/22] docs(landlord): fix two workflow doc samples to use the real API - workflow.md: bind `const files = args` in the script (files was undefined) - hooks.md: pass workflow() args as the second argument, not a ref field Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/landlord/hooks.md | 18 ++++++++++-------- docs/landlord/workflow.md | 3 ++- 2 files changed, 12 insertions(+), 9 deletions(-) diff --git a/docs/landlord/hooks.md 
b/docs/landlord/hooks.md index 178c5b1..394e2d8 100644 --- a/docs/landlord/hooks.md +++ b/docs/landlord/hooks.md @@ -212,14 +212,16 @@ Runs another workflow inline, sharing the current run's concurrency cap, agent c // Named workflow from the registry (pass createWorkflowRegistry to RuntimeConfig.workflows) const subResult = await wf.workflow('analyze', { path: 'src/' }); -// Inline source string -const subResult2 = await wf.workflow({ - source: ` - export const meta = { name: 'quick-check', description: 'Quick check' } - return await agent('Quick check: ' + args) - `, - args: 'src/index.ts', -}); +// Inline source string — args is the SECOND argument, not a field of the ref object +const subResult2 = await wf.workflow( + { + source: ` + export const meta = { name: 'quick-check', description: 'Quick check' } + return await agent('Quick check: ' + args) + `, + }, + 'src/index.ts', +); ``` --- diff --git a/docs/landlord/workflow.md b/docs/landlord/workflow.md index 8f6dfe9..83d50d1 100644 --- a/docs/landlord/workflow.md +++ b/docs/landlord/workflow.md @@ -35,6 +35,7 @@ const adapter = anthropicAdapter({ apiKey: process.env.ANTHROPIC_API_KEY! }); const source = ` export const meta = { name: 'review', description: 'Review and verify findings' } +const files = args phase('Find') const findings = await parallel(files.map(f => () => agent( 'Review ' + f + ' for security issues', @@ -67,7 +68,7 @@ if (result.ok) { } ``` -Note that `args` is passed to the script as a global `args` value; here the files array is available in the script as `files` only because the example uses `args` directly — update the script to use `const files = args` or pass it via the `args` field. +The value passed as `args` is exposed to the script as the global `args`. Here the host passes the file list via `args: files`, and the script binds it locally with `const files = args` before using it. ### Typed workflow (`defineWorkflow`)