feat(gateway): progressive-disclosure recipes (opencode skills pattern, renamed) #484
raahulrahl merged 8 commits into main
The agent loader was importing splitFrontmatter from the skill module and duplicating parseYaml/parseScalar locally. The skill module is scheduled for rename to "recipe" (progressive-disclosure playbooks, opencode pattern), so the shared pieces move to a util both loaders can depend on without the agent module reaching into the skill namespace.
No behavior change. agent/index.ts shrinks by 51 lines; same tests pass (154/154).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
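As a rough illustration of what a shared frontmatter splitter does, here is a minimal sketch. The return shape and the handling of a missing closing fence are assumptions; the PR does not show the real signature of splitFrontmatter.

```typescript
// Hypothetical shape: the real util in this PR may return something different.
interface FrontmatterSplit {
  frontmatter: string | null; // raw YAML between the "---" fences, or null if absent
  body: string;               // everything after the closing fence
}

function splitFrontmatter(source: string): FrontmatterSplit {
  // Strip a leading BOM and split on either LF or CRLF line endings.
  const lines = source.replace(/^\uFEFF/, "").split(/\r?\n/);
  if (lines[0]?.trim() !== "---") {
    return { frontmatter: null, body: source };
  }
  // Find the closing fence, skipping the opening one at index 0.
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) {
    // Assumption: an unterminated fence is treated as "no frontmatter".
    return { frontmatter: null, body: source };
  }
  return {
    frontmatter: lines.slice(1, close).join("\n"),
    body: lines.slice(close + 1).join("\n"),
  };
}

const doc = [
  "---",
  "name: payment-required-flow",
  "description: Handle the payment-required state",
  "---",
  "# Steps",
  "Surface the payment URL and end the turn.",
].join("\n");

const parts = splitFrontmatter(doc);
```

Keeping this in a util that neither loader owns means the agent loader and the recipe loader can both import it without depending on each other.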
Recipes are markdown playbooks the planner can lazy-load when a task
matches — metadata (name + description) sits in the system prompt, the
body only loads when the planner calls load_recipe. Pattern borrowed from
opencode's skill module, renamed because "skill" is already taken in the
gateway for A2A SkillRequest (agent capabilities exposed via /plan).
Supports two layouts: flat recipes/foo.md and bundled
recipes/bar/RECIPE.md with sibling scripts/, reference/, etc. files the
tool (Phase 4) will surface to the planner.
Duplicate names throw at load time — silent precedence would make it
ambiguous which body loads. Permission filtering via the existing
Ruleset evaluator; default action is "allow" so agents without explicit
recipe rules see everything.
Nothing consumes this service yet — Phase 4 wires in the load_recipe
tool and Phase 5 injects the fmt(list, { verbose: true }) block into
the system prompt.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
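The load-time rules described in this commit (two layouts, duplicate names throwing, alphabetical order) could be sketched roughly as follows. All types and function names here are hypothetical, and using the directory name as the fallback for bundled recipes is an assumption; the PR only states the filename-stem fallback.

```typescript
// Hypothetical data model: the PR does not show the loader's real types.
interface RecipeFile {
  path: string;        // "recipes/foo.md" (flat) or "recipes/bar/RECIPE.md" (bundled)
  name?: string;       // frontmatter name, when present
  description: string; // frontmatter description
}

interface Recipe {
  name: string;
  description: string;
  path: string;
  bundled: boolean;
}

// Fallback name: filename stem for flat recipes; directory name for bundled
// recipes (the bundled case is an assumption made for this sketch).
function fallbackName(path: string): string {
  const parts = path.split("/");
  const file = parts[parts.length - 1] ?? "";
  if (file === "RECIPE.md") return parts[parts.length - 2] ?? "";
  return file.replace(/\.md$/, "");
}

function buildRegistry(files: RecipeFile[]): Recipe[] {
  const byName = new Map<string, Recipe>();
  for (const f of files) {
    if (f.description.trim() === "") {
      throw new Error(`Recipe at ${f.path} has an empty description`);
    }
    const name = f.name ?? fallbackName(f.path);
    const existing = byName.get(name);
    if (existing) {
      // Throw instead of picking a winner, so it is never ambiguous which body loads.
      throw new Error(`Duplicate recipe name "${name}": ${existing.path} and ${f.path}`);
    }
    byName.set(name, {
      name,
      description: f.description,
      path: f.path,
      bundled: f.path.endsWith("/RECIPE.md"),
    });
  }
  return Array.from(byName.values()).sort((a, b) => a.name.localeCompare(b.name));
}

const recipes = buildRegistry([
  { path: "recipes/payment-required-flow.md", description: "Handle payment-required" },
  { path: "recipes/multi-agent-research/RECIPE.md", description: "Chain search and summarizer" },
]);

let duplicateError = "";
try {
  buildRegistry([
    { path: "recipes/foo.md", description: "flat" },
    { path: "recipes/foo/RECIPE.md", description: "bundled" },
  ]);
} catch (err) {
  duplicateError = err instanceof Error ? err.message : String(err);
}
```

The second call shows the cross-layout case: a flat foo.md and a bundled foo/RECIPE.md resolve to the same name, so loading fails loudly.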
The old Skill service (src/skill/index.ts) was a working opencode-style markdown loader registered in the app layer but never read by any consumer: the planner operates on A2A SkillRequest objects from the /plan body, not markdown files. Phase 1 moved its only cross-module use (splitFrontmatter, borrowed by the agent loader) into a shared util, so the module is now fully orphaned.
Deleted src/skill/. Swapped Skill.defaultLayer → Recipe.defaultLayer in the Level-1 layer merge.
Typecheck clean, 154/154 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single tool the planner calls to pull a recipe's full body into context. Metadata (name + description) stays in the system prompt via Recipe.fmt; this tool is the lazy-load side of the disclosure.
Design choices vs opencode's SkillTool:
- Plain fs for bundled-file enumeration (no ripgrep dep). Two-level recursive walk, capped at 10 entries: enough for scripts/ and reference/ subdirs, shallow enough to avoid accidental node_modules inclusion.
- Bundled-file scan only runs for nested recipes (recipes/foo/RECIPE.md); flat recipes (recipes/foo.md) would otherwise surface OTHER recipes as siblings.
- ctx.ask is optional on ToolContext today (session/prompt.ts wrapTool doesn't set it), so the permission gate is a no-op. Kept the call site so a Phase-2 permission UI inherits recipe gating with zero code change.
- Dynamic description is computed from the permission-filtered available list passed in by the planner, so the LLM only sees recipes it's allowed to load.
Not wired into the planner yet; that's Phase 5.
Typecheck clean, 154/154 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
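The capped, two-level enumeration and the response envelope might look something like the sketch below. To stay self-contained it walks an in-memory tree rather than the real filesystem; DirNode, listBundledFiles, buildEnvelope, the name attribute, and omitting <recipe_files> when the list is empty are all assumptions made for illustration, not the PR's actual code.

```typescript
// In-memory stand-in for a directory: null marks a file, an object marks a subdirectory.
interface DirNode {
  [name: string]: DirNode | null;
}

const MAX_ENTRIES = 10; // cap stated in the commit message
const MAX_DEPTH = 2;    // recipe dir itself plus one level of subdirs (our reading of "two-level")

function listBundledFiles(root: DirNode, skip: string = "RECIPE.md"): string[] {
  const out: string[] = [];
  const walk = (dir: DirNode, prefix: string, depth: number): void => {
    for (const name of Object.keys(dir).sort()) {
      if (out.length >= MAX_ENTRIES) return; // stop as soon as the cap is hit
      const child = dir[name];
      const rel = prefix + name;
      if (child === null) {
        if (rel !== skip) out.push(rel); // the recipe body itself is not a bundled asset
      } else if (child !== undefined && depth < MAX_DEPTH) {
        walk(child, rel + "/", depth + 1);
      }
      // Directories below MAX_DEPTH are ignored entirely.
    }
  };
  walk(root, "", 1);
  return out;
}

// Hypothetical envelope layout: only the two tag names come from the PR.
function buildEnvelope(name: string, body: string, files: string[]): string {
  const parts = [`<recipe_content name="${name}">`, body.trim(), "</recipe_content>"];
  if (files.length > 0) {
    parts.push("<recipe_files>", ...files, "</recipe_files>");
  }
  return parts.join("\n");
}

const bundle: DirNode = {
  "RECIPE.md": null,
  scripts: { "run.sh": null, "check.sh": null },
  reference: { "states.md": null, deep: { "ignored.md": null } },
};

const files = listBundledFiles(bundle);
const envelope = buildEnvelope("multi-agent-research", "# Steps\n", files);
```

A flat recipe would skip listBundledFiles altogether, since its parent directory holds other recipes rather than its own assets.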
Planner now:
1. Pulls the permission-filtered recipe list at plan start via
recipes.available(plannerAgent).
2. Registers load_recipe as one of the session's dynamic tools, with
its description rendered from the same filtered list.
3. Injects Recipe.fmt(list, { verbose: true }) into the system prompt
between the agent prompt and config.instructions — but only when
the list is non-empty, so a clean gateway with no recipes produces
no noise in the prompt.
PromptInput gained a recipeSummary?: string field; buildSystemPrompt
accepts it as an optional third argument. No other call sites — the only
consumer is the planner.
End-to-end: system prompt tells the LLM which recipes exist; load_recipe
tool makes the body materializable on demand. Progressive disclosure
complete.
Typecheck clean, 154/154 tests pass. Integration tests for the
end-to-end flow land in Phase 7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
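The prompt-assembly step above can be sketched as follows. The parameter types are assumptions: the commit only says that buildSystemPrompt accepts recipeSummary as an optional third argument and that the block sits between the agent prompt and the configured instructions.

```typescript
// Hypothetical signature for illustration; only the optional third argument is from the PR.
function buildSystemPrompt(
  agentPrompt: string,
  instructions: string[],
  recipeSummary?: string,
): string {
  const sections: string[] = [agentPrompt];
  // Inject the recipe block only when there is something to show, so a gateway
  // with no recipes produces exactly the same prompt as before.
  if (recipeSummary !== undefined && recipeSummary.trim() !== "") {
    sections.push(recipeSummary);
  }
  sections.push(...instructions);
  return sections.join("\n\n");
}

const withRecipes = buildSystemPrompt(
  "You are the planner.",
  ["Be concise."],
  "<recipes>...</recipes>", // placeholder for the Recipe.fmt output
);
const withoutRecipes = buildSystemPrompt("You are the planner.", ["Be concise."]);
```

On the planner side, the caller would pass undefined whenever the permission-filtered list is empty, which keeps the "no noise" guarantee in one place.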
…red flow
Two real, usable playbooks that exercise the full progressive-disclosure path from loader → system prompt → load_recipe tool:
- multi-agent-research: instructs the planner how to chain a search agent and a summarizer agent, which A2A task states it can see between them, and where to stop vs. where to wait for the user.
- payment-required-flow: documents the recurring gotcha that payment-required is a paused non-terminal state, not a failure. No retries, no speculation, surface the payment URL verbatim and end the turn. This mirrors the guidance the project CLAUDE.md surfaces from past PRs.
Smoke-loaded both through loadRecipesDir + fmt to verify parsing and rendering: name, description, tags, triggers and both verbose/terse formats come out clean. Full integration test coverage lands in Phase 7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two new test files, 20 new tests (total now 174/174 passing):
- tests/recipe/loader.test.ts: 12 tests. Covers flat + bundled layout discovery, alphabetical sort, cross-layout duplicate detection, empty-description rejection, name-fallback to filename stem, missing-directory behavior, tag/trigger parsing, and both fmt() output modes (verbose XML + terse markdown).
- tests/recipe/tool.test.ts: 8 tests. Covers describeRecipe with filtered lists, unknown-name errors that include the available list, <recipe_content>/<recipe_files> envelope for flat and bundled recipes, the 10-entry enumeration cap, and the ctx.ask permission hook contract.
Dropped the planner-integration test from the original plan. The wiring (planner → load_recipe tool + recipeSummary in PromptInput) is 16 visible lines; a real integration test would need to mock the LLM provider, Session.Service, DB, and Bus, which costs more than it catches. The loader and tool contracts cover the interesting surface; the wiring itself is too trivial for its own integration test at this scope.
Typecheck clean, 20/20 recipe tests pass, 174/174 overall.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
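For readers unfamiliar with the two contracts these tests exercise, here is a minimal sketch of a formatter with verbose and terse modes and of an unknown-name error that lists what is available. The exact tag names, markdown layout, and error wording are assumptions; the PR only states that verbose output is XML, terse output is markdown, and that the error includes the available list.

```typescript
interface RecipeInfo {
  name: string;
  description: string;
}

// Hypothetical output formats, chosen only to illustrate the two modes.
function fmt(list: RecipeInfo[], opts: { verbose: boolean } = { verbose: false }): string {
  if (list.length === 0) return "No recipes available.";
  if (opts.verbose) {
    const items = list.map(
      (r) =>
        `  <recipe>\n    <name>${r.name}</name>\n    <description>${r.description}</description>\n  </recipe>`,
    );
    return ["<available_recipes>", ...items, "</available_recipes>"].join("\n");
  }
  return list.map((r) => `- **${r.name}**: ${r.description}`).join("\n");
}

// Look up a recipe by name; on a miss, tell the caller what it could have asked for.
function requireRecipe(list: RecipeInfo[], name: string): RecipeInfo {
  const found = list.find((r) => r.name === name);
  if (!found) {
    const names = list.map((r) => r.name).join(", ") || "none";
    throw new Error(`Recipe "${name}" not found. Available recipes: ${names}`);
  }
  return found;
}

const available: RecipeInfo[] = [
  { name: "multi-agent-research", description: "Chain search and summarizer" },
  { name: "payment-required-flow", description: "Handle payment-required" },
];

const terse = fmt(available);
const verbose = fmt(available, { verbose: true });

let lookupError = "";
try {
  requireRecipe(available, "missing");
} catch (err) {
  lookupError = err instanceof Error ? err.message : String(err);
}
```

Including the available names in the error gives the LLM a way to self-correct on the next tool call instead of guessing again.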
gateway/README.md:
- Fix stale bullet: "Tool registry + Skill/Agent loaders" →
"Tool registry + Agent/Recipe loaders (progressive-disclosure
playbooks)" — the Skill loader was removed in this branch, and the
status list needs to reflect what's actually shipped.
- New §Recipes section: what they are, why you'd write one, the flat
vs. bundled layout, frontmatter shape, per-agent visibility via
the existing permission system, and the end-to-end load path.
Points at src/recipe/index.ts and src/tool/recipe.ts for source.
CLAUDE.md:
- Append a Recent Learnings entry so future Claude sessions know
recipes exist and why they're named "recipe" (the skill namespace
was already taken by A2A SkillRequest).
Typecheck clean, 174/174 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walkthrough
The pull request introduces a new recipe system to the gateway, replacing the prior skill-loading subsystem. Recipes are markdown playbooks with YAML frontmatter.
Sequence Diagram
sequenceDiagram
participant Planner as Planner
participant RecipeService as Recipe Service
participant PromptBuilder as Prompt Builder
participant Tool as load_recipe Tool
Planner->>RecipeService: available(agent)?
activate RecipeService
RecipeService->>RecipeService: filter by permission
RecipeService-->>Planner: [available recipes]
deactivate RecipeService
Planner->>PromptBuilder: buildSystemPrompt(agent, instructions, recipeSummary)
activate PromptBuilder
PromptBuilder->>PromptBuilder: inject recipe list into system prompt
PromptBuilder-->>Planner: system prompt + recipe block
deactivate PromptBuilder
Planner->>Planner: build load_recipe tool description
Planner->>Tool: define(load_recipe, ...)
Planner->>Planner: invoke LLM with recipes in context
Planner-->>Tool: invoke load_recipe({ name })
activate Tool
Tool->>RecipeService: get(name)
RecipeService-->>Tool: recipe content + metadata
Tool->>Tool: enumerate bundled files (if nested)
Tool->>Tool: build <recipe_content> envelope
Tool-->>Planner: recipe_content + recipe_files
deactivate Tool
Planner->>Planner: consume recipe in planning
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Summary
Port OpenCode's "skills" pattern into the Bindu gateway, renamed to recipes to avoid collision with A2A SkillRequest (an agent capability the external caller hands in via /plan). Recipes are markdown playbooks the planner lazy-loads on demand: metadata in the system prompt, body fetched via a load_recipe tool only when the planner decides a recipe applies.
Key benefit: the gateway can now carry durable, operator-authored orchestration playbooks (multi-agent flows, A2A state-handling rules, tenant policies) without bloating the planner's system prompt with every instruction at once.
What this PR changes
- gateway/src/recipe/index.ts: loads gateway/recipes/*.md and gateway/recipes/<name>/RECIPE.md, exposes list / get / available(agent) / dirs / fmt. Permission-filtered via the existing Ruleset evaluator.
- gateway/src/tool/recipe.ts: load_recipe({ name }) returns a <recipe_content> envelope with the full markdown and a <recipe_files> listing for bundled assets. Plain-fs enumeration (no ripgrep dep), 2-level recursion, capped at 10 files.
- gateway/src/planner/index.ts: per-plan, registers load_recipe and injects Recipe.fmt(list, { verbose: true }) into the system prompt via a new recipeSummary?: string on PromptInput. Empty list → no noise in the prompt.
- gateway/recipes/: multi-agent-research (chaining a search agent and a summarizer, plus A2A state handling) and payment-required-flow (the "don't retry silently" gotcha).
- gateway/src/skill/index.ts was an opencode-style markdown loader that was registered in the app layer but never consumed. Deleted. Its only utility export (splitFrontmatter) was factored out to gateway/src/_shared/util/frontmatter.ts first so the agent loader loses its accidental dep.
- §Recipes section in gateway/README.md and a Recent Learnings entry in the project CLAUDE.md.
Why "recipe" and not "skill"
The gateway already uses skill for the A2A SkillRequest object, an agent capability surfaced on the /plan request body. Overloading the word for a second orthogonal concept would guarantee confusion. Renamed everywhere upfront.
Test plan
- cd gateway && npm run typecheck: clean
- cd gateway && npm test: 174/174 pass (20 new, 154 baseline)
- tests/recipe/loader.test.ts (12): flat/bundled discovery, duplicate-name errors, sort stability, empty-description rejection, name-fallback, tag/trigger parsing, fmt output modes
- tests/recipe/tool.test.ts (8): description content, unknown-name errors, envelope shape for flat and bundled recipes, 10-entry enumeration cap, ctx.ask permission hook
- Seed recipes smoke-loaded through loadRecipesDir + fmt; verified XML + markdown output
- /plan smoke against a live gateway + OpenRouter: deferred; requires env (Supabase, OpenRouter, Hydra). Covered implicitly by the two unit test files.
Scope calls made
- ctx.ask is optional on ToolContext today and unset by wrapTool in session/prompt.ts. The tool calls it when present, a no-op otherwise. Phase 2 permission UI inherits recipe gating with zero code change.
- No remote recipe sources (skills.urls) for now. Flat + nested filesystem layouts only. A Bindu recipe registry can be added later without breaking this.
Out of scope / follow-ups
Branch history
🤖 Generated with Claude Code