Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
339 changes: 0 additions & 339 deletions PLAN.md

This file was deleted.

23 changes: 14 additions & 9 deletions docs/ARCHITECTURE.md
Original file line number Diff line number Diff line change
Expand Up @@ -241,35 +241,40 @@ For `queue_full`: `INSERT INTO audit … status='error', error_message='queue_fu
`skills.ts` adds a filesystem-discovered command surface on top of the `commands.ts` dispatcher. Skills are operator-authored `SKILL.md` files under `$SOLRAC_SKILLS_DIR/<name>/` discovered ONCE at boot (no hot-reload, matching the boot-once config story). Disabled by default (`SOLRAC_SKILLS_ENABLED=false`).

**Boot sequence (in `main.ts`).**
1. Load skills sync: `loadSkillsSync(config.skillsDir, BUILT_IN_NAMES)` — fail-soft. Each malformed `SKILL.md` adds an error to the result; valid ones populate the registry. Missing directory → empty registry + one error (NOT a crash).
2. Telegram autocomplete: `setMyCommands([...BOT_COMMAND_REGISTRY, ...skillsToBotCommands(registry.all)])` — built-ins first, skills after.
1. Load skills sync: `loadSkillsSync(config.skillsDir, BUILT_IN_NAMES, config.defaultEngine, loadedIntegrationNames)` — fail-soft. Each malformed `SKILL.md` adds an error to the result; valid ones populate the registry. Missing directory → empty registry + one error (NOT a crash). `loadedIntegrationNames` is the set of integrations that registered ≥1 tool (a probe-failed integration with `toolCount === 0` does NOT count); it gates skills that declare `requires:` (see below).
2. Telegram autocomplete: `setMyCommands([...BOT_COMMAND_REGISTRY, ...skillsToBotCommands(registry.all)])` — built-ins first, skills after. Skills filtered by `requires:` are absent from this payload, so operators never see autocomplete suggestions that fail at use-time.
3. Plumb the registry into `commandDeps` and `parseCommand`.

**Parser hook.** After the `KNOWN_COMMANDS` check misses, `parseCommand` looks up the name in the registry. A hit returns `{ kind: "skill", skill, args }`. A miss falls through to the existing unknown-vs-passthrough logic (`/` → unknown, `:` → passthrough). Built-ins always win — even if a buggy registry returned a skill named `clear`, the built-in arm fires first.

**Frontmatter schema (4 fields).**
**Frontmatter schema (2 required + 4 optional).**
- `name` — required, matches `[a-z0-9_]{1,32}`, must NOT collide with built-in names (rejected at load time).
- `description` — required, ≤256 chars (used in `setMyCommands` payload + `/help` rendering, and as the tool description when `tool: true`).
- `tier` — optional, `primary` | `secondary` | `ollama`. Defaults to `SOLRAC_DEFAULT_ENGINE` so an Ollama-default deploy gets free skills automatically. Explicit `tier: ollama` is rejected when the deploy default isn't ollama (PR-B removed the `>` prefix).
- `max_turns` — optional, integer in `[1, 10]`, default `1`. Model-turn budget for the skill body. Doubles as the SDK `maxTurns` on Claude tiers and as `runToolLoop`'s `maxIterations` on the Ollama tier — the operator gets one knob that constrains both paths uniformly.
- `tool` — optional boolean, default false. When true, exposes the skill as a callable MCP tool to the Ollama agent (Phase 1 restriction: `tool: true` requires `tier: ollama`).
- `requires` — optional, bare string or string array (entries match `[a-z][a-z0-9_-]{0,31}`). Integration dependencies. When any name is absent from `loadedIntegrationNames` at boot, the loader skips the skill with a non-fatal `skills.load_error` and the registry never sees it. `/help` and Telegram autocomplete are filtered by the same registry, so the operator never gets advertised a skill that would fail at use-time. Empty / omitted → unconditional load (preserves back-compat for pre-`requires:` skills).

The body is a prompt template; `{{args}}` is the only placeholder and is replaced literally with the user's text after the command name (or with the agent-supplied `args` argument when called as a tool). The frontmatter parser is a homemade YAML subset (~70 LOC in `skills.ts`) — handles `key: scalar`, `key: [a, b, c]`, quoted strings, integers, booleans. Adding `js-yaml` for a 4-key schema was disproportionate.
The body is a prompt template; `{{args}}` is the only placeholder and is replaced literally with the user's text after the command name (or with the agent-supplied `args` argument when called as a tool). The frontmatter parser is a homemade YAML subset in `skills.ts` — handles `key: scalar`, `key: [a, b, c]`, quoted strings, integers, booleans. Adding `js-yaml` for a 6-key schema was disproportionate.

**Skill execution.** The path forks on `tier`:

- **Claude tiers (`primary` / `secondary`).** `runSkill` in `commands.ts`. Mirrors `runCompactTurn`'s structural pattern: pre-flight cost cap (chat + global; cap-rejected skills cost $0), `query()` with `maxTurns: 1`, no `resume`, `disallowedTools` deny-list, `canUseTool: deny-all`. Audit row tagged `claude:<tier>:<model>:skill:<name>`.
- **Ollama tier.** `runOllamaSkill` (and the bare `runSkillBare` helper) in `commands.ts`. One-shot `/api/chat` (`stream: false`), no history, no SOLRAC.md overlay, no tool loop, no streaming stub. Audit row tagged `ollama:<model>:skill:<name>` with `cost_usd: 0`. Pre-flight cap is skipped (a chat throttled by Claude burn shouldn't lose access to free local inference).
- **Claude tiers (`primary` / `secondary`).** `runSkill` in `commands.ts`. Pre-flight cost cap (chat + global; cap-rejected skills cost $0), then `query()` with `maxTurns: skill.maxTurns`, no `resume` (fresh isolated turn), `tools: { type: "preset", preset: "claude_code" }`, `disallowedTools: ["Agent","Task"]` (sub-agents off; belt-and-suspenders with `policy.ts::SUBAGENT_DENY_TOOLS`). The interactive `canUseTool` factory + `PreToolUse` / `PostToolUse` / `PostToolUseFailure` hooks come from `deps.createCanUseTool` / `policy.ts` — same instances `runAgent` uses, so cost cap, loop detector, and the Telegram-confirm UX behave identically inside a skill. When integrations are loaded, `deps.mcpServer` is wired so the body sees `mcp__solrac__<name>` tools too. Audit row tagged `claude:<tier>:<model>:skill:<name>`; mid-turn cap or loop denials get promoted into `error_message` as `policy_deny:<reason>: …`.
- **Ollama tier.** `runOllamaSkill` (and the bare `runSkillBare` helper) in `commands.ts`. The helper dispatches on whether `OllamaSkillDeps` has `tools + toolTiers + broker` wired:
- **Tools wired** → `runSkillBareWithTools` routes the body through the same `runToolLoop` driver that `runOllamaTurnWithTools` uses. `maxIterations = skill.maxTurns`, fresh loop detector, full `mcp__solrac__*` + `skills__*` catalog with the skill's own `skills__<self>` entry filtered out (recursion guard — see below). No history, no SOLRAC.md overlay, no streaming stub.
- **Tools absent** → fall through to a single-shot `/api/chat` (`stream: false`). Preserves pure text-transform skills (no `requires:`, `max_turns: 1`) at minimum latency.
Either way: audit row tagged `ollama:<model>:skill:<name>` with `cost_usd: 0`. Pre-flight Claude cap is skipped (a chat throttled by Claude burn shouldn't lose access to free local inference).

Reply for both: model output verbatim, HTML-escaped, truncated to ≈3,500 chars (Telegram per-message ceiling minus headroom). Skills are tool-less across both engines — the Claude path denies every tool call via `canUseTool`; the Ollama path posts to `/api/chat` with no `tools` field (a load-bearing invariant for skills-as-tools recursion safety; see below).
Reply for both: model output verbatim, HTML-escaped, truncated to ≈3,500 chars (Telegram per-message ceiling minus headroom). The Ollama path's `runOllamaSkill` wraps the call in `skillToolCtx.run({chatId, fromId, updateId, parentAuditId}, ...)` so any nested `skills__*` invocation inherits the chat context for its own audit row.

**Skills as tools (Phase 1: Ollama-only).** A skill with `tool: true` is also exposed as a callable MCP tool to the Ollama agent (`skill-tools.ts::buildSkillTools`). The model sees it in its tool catalog as `mcp__solrac__skills__<name>` (wire format on Ollama: `skills__<name>`) with the operator-authored description. Tool dispatch:
**Skills as tools (Phase 1: Ollama-only).** Distinct axis from "skills using tools" above — *that* is shipped on both tiers. *This* is whether the Ollama agent can call a skill **by name** as a tool entry in its catalog. A skill with `tool: true` is exposed as a callable MCP tool to the Ollama agent (`skill-tools.ts::buildSkillTools`). The model sees it in its tool catalog as `mcp__solrac__skills__<name>` (wire format on Ollama: `skills__<name>`) with the operator-authored description. Tool dispatch:

1. **Catalog merge.** At boot, eligible skills (`tool: true && tier: ollama`) become `SdkMcpToolDefinition` entries with input schema `{ args: string }`. They're merged into `integrationTools` and `integrationToolTiers` (all `auto`-allow) before `ollamaDeps` is constructed.
2. **Per-turn context propagation.** `runOllamaTurnWithTools` wraps the loop in `skillToolCtx.run({chatId, fromId, updateId, parentAuditId}, () => runToolLoop(...))`. The skill handler reads the store via `AsyncLocalStorage.getStore()` — needed because the SDK tool-handler signature `(args, extra) => ...` leaves no slot for chat context, and concurrent turns require race-free context (the queue runs N chats in parallel). ALS is the standard Node primitive for this.
3. **Handler.** Reads ALS context, calls `runSkillBare`, writes a fresh audit row with `origin='tool_call'` so operators can distinguish agent-driven invocations from operator-typed `/<skill>` calls (`origin='user'`). Returns the model's text as the tool result; the parent Ollama turn composes its final user-facing reply on top.
4. **Permission tier.** Auto-allow. Cost cap is the backstop (Phase 1 ollama skills are free; Phase 2 unlocks Claude-tier skills with a per-skill cost cap).

**Recursion safety (load-bearing).** A skill called as a tool MUST NOT call any tool itself — otherwise an Ollama agent could trigger `skills__foo` which calls `skills__foo` infinitely. `runSkillBare` posts to `/api/chat` with no `tools` field, so the sub-call has no tool surface. `skill-tools.test.ts` asserts the outgoing fetch body has no `tools` key — a regression breaks CI before production.
**Recursion safety (load-bearing).** A skill body must not be able to call itself. Direct recursion is prevented by filtering the skill's own `skills__<self>` entry out of the MCP catalog `runSkillBareWithTools` hands to `runToolLoop` (see `commands.ts::runSkillBareWithTools`). Indirect recursion (A → `skills__B` → `skills__A`) is bounded by two backstops in series: `skill.maxTurns` caps the tool-loop iterations for each invocation, and the shared loop detector (`policy.ts::createLoopDetector`, fresh per invocation, threshold 3 identical `(tool, input)` calls) trips before a deep cycle materializes. `skill-tools.test.ts` asserts the self-filter; a regression breaks CI before production.

**Phase 2 deferred.** Cross-engine tool calls (Ollama agent → Sonnet skill) would land via the same SDK MCP server already used for integrations. Phase 1's ollama-tier restriction sidesteps the cost-escalation question (a misbehaving Ollama agent calling a `tier: primary` skill 100× would burn real $$$). When Phase 2 lands, expect a per-skill cost cap and a `confirm`-tier option for Claude-backed tool calls.

Expand Down
2 changes: 1 addition & 1 deletion docs/FEATURES.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ The complete feature list, grouped by theme. See [../README.md](../README.md) fo

- **Customizable persona via `SOUL.md` + `SOLRAC.md`** — two operator-editable markdown files at the launch directory. `SOUL.md` (voice, stance, safety) ships with the package and is read once at boot. `SOLRAC.md` (operator overlay: who runs it, channel posture, project context) is re-read every turn so live edits land on the next message without a restart. See [USAGE.md#customizing-solrac-soulmd-and-solracmd](./USAGE.md#customizing-solrac-soulmd-and-solracmd).
- **Slash commands** — `/help`, `/status`, `/context`, `/clear`, `/compact` give the operator visibility and control over conversation context, spend, and session state without leaving Telegram. Both `/cmd` and `:cmd` invoke the same handler (`:` avoids Telegram's auto-link on bold text).
- **Operator-defined skills** — drop a `SKILL.md` into `$SOLRAC_SKILLS_DIR/<name>/` and that filename becomes a slash command on the next boot. Tool-less single-turn prompts with `{{args}}` templating; tier defaults to `SOLRAC_DEFAULT_ENGINE` (free on Ollama deploys); cost-capped under the existing per-chat hourly budget. Optional `tool: true` frontmatter exposes the skill as a callable MCP tool to the local Ollama agent (Phase 1: `tier: ollama` only) so natural-language requests can route through your prompts. Off by default; enable with `SOLRAC_SKILLS_ENABLED=true`.
- **Operator-defined skills** — drop a `SKILL.md` into `$SOLRAC_SKILLS_DIR/<name>/` and that filename becomes a slash command on the next boot. `{{args}}` templating; per-skill `max_turns` (1–10) so a single-shot text transform stays bounded while an agentic skill (e.g. `notion_search` → `notion_create_page`) gets headroom; the body runs with the same Claude Code tool preset (Claude tiers) or integrations MCP catalog (Ollama tier) as a normal turn, under the same three-tier policy, cost cap, and loop detector. Optional `requires:` frontmatter gates a skill on named integrations being loaded at boot — missing deps → skill skipped, never appears in `/help` or autocomplete. Optional `tool: true` exposes the skill as a callable MCP tool to the local Ollama agent (Phase 1: `tier: ollama` only) so natural-language requests can route through your prompts. Off by default; enable with `SOLRAC_SKILLS_ENABLED=true`.
- **Scheduled tasks** — drop a `TASK.md` into `$SOLRAC_TASKS_DIR/<name>/` and the prompt fires on its configured schedule (`every 1h`, `daily_at 09:00`, `at 2026-05-15T13:00:00Z`) into a configured chat. Engine inheritance (defaults to `config.defaultEngine`), per-task `max_cost_usd`, boot catch-up jitter; fires synthesize updates through the same turn queue so all existing safety machinery applies. `/tasks` lists loaded tasks with last + next fire; `/tasks run <name>` triggers on demand. Off by default; enable with `SOLRAC_TASKS_ENABLED=true`. See [USAGE.md#scheduled-tasks](./USAGE.md#scheduled-tasks).

## Transport
Expand Down
2 changes: 1 addition & 1 deletion docs/GLOSSARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ Terms that recur across Solrac's codebase and docs. Alphabetical.

**skill** — User-level Claude Code skill in `.claude/skills/<name>/SKILL.md`. Available to the agent via the SDK's preset systemPrompt + tool routing. v1 doesn't enumerate skills explicitly in the systemPrompt — that's [OQ#11](./ROADMAP.md#oq11-skill-router).

**Solrac skill (operator-defined)** — Distinct from the Claude Code skill above. A `SKILL.md` file under `$SOLRAC_SKILLS_DIR/<name>/` that defines a Telegram slash command (`/<name>`) without code changes. Loaded ONCE at boot by `skills.ts::loadSkillsSync`; runs as a single tool-less turn (`runSkill` in `commands.ts`). Tier defaults to `SOLRAC_DEFAULT_ENGINE` so an Ollama-default deploy gets free skills automatically. Optional `tool: true` frontmatter additionally exposes the skill as a callable MCP tool to the Ollama agent — see **skill tool**. Disabled by default (`SOLRAC_SKILLS_ENABLED=false`). See [USAGE.md#skills-operator-defined-commands](./USAGE.md#skills-operator-defined-commands).
**Solrac skill (operator-defined)** — Distinct from the Claude Code skill above. A `SKILL.md` file under `$SOLRAC_SKILLS_DIR/<name>/` that defines a Telegram slash command (`/<name>`) without code changes. Loaded ONCE at boot by `skills.ts::loadSkillsSync`; runs via `runSkill` (Claude tiers) or `runOllamaSkill` (Ollama) in `commands.ts`. The body sees the same tool surface a normal turn does (Claude Code preset on Claude tiers; the integrations MCP catalog via `runToolLoop` on Ollama when tools are wired) — bounded by the per-skill `max_turns` frontmatter (default 1, max 10) and constrained by the same three-tier policy + cost cap + loop detector + `canUseTool` confirm UX as a regular turn. Tier defaults to `SOLRAC_DEFAULT_ENGINE` so an Ollama-default deploy gets free skills automatically. Optional `requires:` frontmatter gates the skill on named integrations being loaded at boot (missing deps → silently absent from `/help` + autocomplete). Optional `tool: true` additionally exposes the skill as a callable MCP tool to the Ollama agent — see **skill tool**. Disabled by default (`SOLRAC_SKILLS_ENABLED=false`). See [USAGE.md#skills-operator-defined-commands](./USAGE.md#skills-operator-defined-commands).

**skill tool** — A Solrac skill with `tool: true` frontmatter, exposed to the Ollama agent's tool catalog as `mcp__solrac__skills__<name>` (wire format on Ollama: `skills__<name>`). The model decides when to call it from natural language; the tool description is `skill.description`; input schema is `{ args: string }`. Phase 1 restriction: requires `tier: ollama` (free, no cross-engine cost surprises). Auto-allow permission tier; cost cap is the backstop. Built by `skill-tools.ts::buildSkillTools`. Per-turn context (chatId, fromId, updateId, parentAuditId) propagates via `node:async_hooks::AsyncLocalStorage` (`skillToolCtx`) — the SDK tool-handler signature `(args, extra)` leaves no slot for chat context, and concurrent turns require race-free isolation. Audit row tagged `origin='tool_call'` to distinguish from operator-typed slash invocations.

Expand Down
8 changes: 6 additions & 2 deletions docs/ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -316,9 +316,13 @@ Trade-off: every token in systemPrompt ships on every turn. If the registry is 5

**Status:** Phase 1 shipped (Ollama-only). See `src/skill-tools.ts` and [USAGE.md#skills-as-tools-phase-1-ollama-only](./USAGE.md#skills-as-tools-phase-1-ollama-only).

A `SKILL.md` with `tool: true` frontmatter is exposed to the Ollama agent's tool catalog as `mcp__solrac__skills__<name>`. The model decides when to call from natural language; the description is `skill.description`; the schema is `{ args: string }`. Auto-allow tier; cost cap is the backstop. Phase 1 restricted to `tier: ollama` skills (free) to sidestep the cost-escalation question. Audit row tagged `origin='tool_call'`.
Two distinct axes — kept separate because they have different cost-exposure shapes:

**Phase 2 (deferred).** Expose to Claude tiers via the existing `solrac` MCP server. Lift the `tier: ollama` restriction; add per-skill cost cap; consider `confirm`-tier gating on Claude-backed tool calls so a runaway Ollama agent can't burn $$$ silently.
1. **Skills *using* tools (shipped on both tiers).** A skill body — Claude or Ollama — runs with the same tool surface a regular turn does (Claude Code preset on Claude; integrations MCP catalog on Ollama). Bounded by per-skill `max_turns` frontmatter (1–10, default 1) and the same three-tier policy + cost cap + loop detector. Pure text-transform skills stay cheap with `max_turns: 1`; agentic skills (`/log` chaining `notion_search` → `notion_create_page`) declare what they need.

2. **Skills *callable as* tools by the agent (Phase 1: Ollama-only).** A `SKILL.md` with `tool: true` frontmatter is exposed to the Ollama agent's tool catalog as `mcp__solrac__skills__<name>`. The model decides when to call from natural language; the description is `skill.description`; the schema is `{ args: string }`. Auto-allow tier; cost cap is the backstop. Phase 1 restricted to `tier: ollama` skills (free) to sidestep the cost-escalation question (a misbehaving Ollama agent calling a `tier: primary` skill 100× would burn real $$$). Audit row tagged `origin='tool_call'`.

**Phase 2 (deferred) — axis 2 expansion.** Expose tool-callable skills to Claude tiers via the existing `solrac` MCP server. Lift the `tier: ollama` restriction on `tool: true`; add a per-skill `max_cost_usd` cap separate from the chat-level cap; consider `confirm`-tier gating on Claude-backed tool-callable skills so the operator approves each cross-engine escalation.

**Phase 3 (deferred).** Streamed skill output (currently the agent waits for the full skill reply before continuing); per-skill telemetry surface in `/status` or a dedicated `/skills` slash command.

Expand Down
Loading
Loading