diff --git a/README.md b/README.md index 477fbfd..92c6e6e 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ Opinionated Claude Code skills for repo hygiene: audit builds, tame CI noise, hu | `techne:deslop` | Scans comments and docstrings for AI-generated slop and proposes tightened rewrites. | | `techne:docs-site` | Maintains the Zensical-powered docs site: config, deploy pipeline, theming, link integrity. | | `techne:docsync` | Verifies documentation claims (CLI commands, paths, config keys, signatures) against the actual code. | +| `techne:research-grounded` | Flags design decisions in IMPL/ROADMAP that lack `# research(YYYY-MM):` provenance, then web-searches to ground them. | | `techne:reslop` | Rewrites docstrings grounded in the implementation rather than deleting them outright. | | `techne:sisters` | Cross-repo drift audit across the sister repos listed in `~/.claude/techne.toml`. | | `techne:theoros` | Starts an observed live dev session: Claude drives the REPL in a named `tmux` session; you spectate read-only via `tmux attach -r`. | diff --git a/ROADMAP.md b/ROADMAP.md index cbcb0f2..072a872 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -69,8 +69,6 @@ wild. audit. See [skill-ideas-from-audit-of-audit](./superpowers/plans/2026-05-21-skill-ideas-from-audit-of-audit.md) item #2. - **`/techne:positioning`** — identity-drift check (claim vs work). Same plans file, item #1. -- **`/techne:research-grounded`** — flag IMPL/ROADMAP design choices - missing `# research(YYYY-MM):` provenance. Same plans file, item #3. - **`/techne:workspace-orphans`** — content-bearing files outside the active sister perimeter. Same plans file, item #4. n=1 today; build when n≥2. @@ -122,6 +120,12 @@ wild. Detail lives in git history (`git log`) and the live skill code. This log is pruned once work is durably shipped. +- 2026-05-29 — **`/techne:research-grounded` skill.** Audits IMPL.md / ROADMAP.md for committed + design decisions (library / framework / pattern / architecture choices) that lack a + `# research(YYYY-MM):` provenance tag, then web-searches to ground them — closing the loop + that, skipped, turned an SSML capability bet into 5 revertable PRs. Judgment over grep: + descriptive "instead of" prose and hypotheticals are filtered out. Validated on kourai + (8/10 grep candidates correctly ignored, 2 genuine gaps surfaced). Sibling of `/techne:docsync`. - 2026-05-27 — **docsync cross-repo skill-context fix.** `/techne:docsync` audits a doc that may live in a different repo than CWD (e.g. `docsync ../velocity-fl/README.md`), but it loaded `.claude/skill-context.md` via a load-time `` !`cat …` `` injection that always diff --git a/docs/skills/research-grounded.md b/docs/skills/research-grounded.md new file mode 100644 index 0000000..b0af48b --- /dev/null +++ b/docs/skills/research-grounded.md @@ -0,0 +1,31 @@ +# `techne:research-grounded` + +Flag design decisions in a plan — library, framework, pattern, and architecture choices — that lack a `# research(YYYY-MM):` provenance comment, then web-search to ground them before they harden into code. + +## When to use + +- Before a plan's decisions turn into commits: "are these library / architecture choices grounded in current best practice, or training-cutoff recall?" +- After drafting IMPL.md / ROADMAP.md entries that pick a tool, pattern, or protocol. +- When a decision is stated as fact ("X supports Y") that was never verified against the target's current docs — the class of miss that becomes revertable work. +- Auditing `planning/` design notes for un-cited architectural choices. + +## Usage + +Invoke by name in Claude Code: + +``` +/techne:research-grounded +``` + +Default scope is `IMPL.md` + `ROADMAP.md` in the current repo. Narrow or redirect by naming a file or directory: + +``` +/techne:research-grounded planning/ +``` + +The skill greps for decision language only as a candidate seed, then reads around each hit to keep the genuine technology / pattern / architecture choices — descriptive prose ("reads X instead of Y") and hypotheticals ("what if we used…") are filtered out. For each committed decision lacking an adjacent `research(YYYY-MM):` tag, it reports a `decision → gap → action` entry and, on confirmation, web-searches current best practice and adds the provenance tag. It never invents a citation. + +## See also + +- [`techne:docsync`](docsync.md): verifies documentation claims against the code — this skill is the sibling that verifies design *decisions* against *evidence*. +- [`techne:deslop`](deslop.md): tightens AI-slop prose, a different kind of doc-quality pass. diff --git a/plugins/techne/skills/research-grounded/SKILL.md b/plugins/techne/skills/research-grounded/SKILL.md new file mode 100644 index 0000000..3231309 --- /dev/null +++ b/plugins/techne/skills/research-grounded/SKILL.md @@ -0,0 +1,81 @@ +--- +name: research-grounded +description: Audit a plan's design decisions — library, framework, pattern, and architecture choices in IMPL.md / ROADMAP.md — for missing `# research(YYYY-MM):` provenance, then web-search to ground them. Use when you want architectural choices verified against current best practice before they harden into code, rather than resting on training-cutoff recall. Catches committed decisions stated as fact but never checked — the class of miss that becomes revertable work. +disable-model-invocation: false +allowed-tools: Bash Glob Grep Read Edit WebSearch WebFetch Agent +--- + +# Research-grounded + +Catch design decisions that were made without checking current best practice — *before* they become code. A planning doc that says "we'll use X" with no provenance is a bet on training-cutoff recall; the convention `# research(YYYY-MM): ` records that the bet was checked against reality. This skill finds the bets that weren't. + +## Repo context + +`/techne:research-grounded` audits docs in the CWD repo by default, or a path you name (which may live in another repo). Resolve the repo root from the argument: + +```bash +git -C "$(dirname "")" rev-parse --show-toplevel # no path arg → CWD repo root +``` + +The target's `.claude/skill-context.md` isn't required, but its `## repo` section helps you tell a genuine technology choice from incidental prose. + +## What needs provenance + +A **committed design decision** — a choice with real alternatives, where picking wrong is expensive to undo: + +- **Library / framework / tool** — "MapLibre + PMTiles", "Playwright over Cypress", "switched from JWT to server sessions". +- **Pattern / technique** — "event delegation over per-node listeners", "stale-while-revalidate", "container queries for card internals". +- **Architecture / protocol** — "A2A `Message.metadata` instead of text tags", "OPFS for tile storage". +- **Version / API-surface bets stated as fact** — "ElevenLabs v3 supports SSML break tags" (the miss that cost 5 revertable PRs: a capability asserted, never verified against the target's docs). + +Each of these, when committed in IMPL/ROADMAP, should carry an adjacent `# research(YYYY-MM): `. Negative-space decisions ("we did **not** do X because…") need provenance too. + +## Ignore + +- **Descriptive implementation prose.** "reads the metrics object instead of re-reading constants", "returns a structured fallback instead of a retry loop" — uses decision words but picks no technology or pattern. Not research-worthy. +- **Hypotheticals & aspirations.** "what if we used sessions instead of JWT", "we plan to", "coming soon" — not yet a commitment (mirrors how `/techne:docsync` waves through roadmap language). +- **Already grounded.** A decision with a `research(...)` tag on the same bullet or within a few lines — even a terse one. +- **Bugfixes, renames, mechanical changes** — no alternative space to research. +- **Settled house conventions** — the fleet's own recorded standards (SHA-pinning, squash-merge). Don't re-flag every mention. + +## Workflow + +1. **Scope.** Default: `IMPL.md` + `ROADMAP.md` in the target repo. Accept a named file or a directory (e.g. `planning/`). For multiple files, fan out — one `Explore` subagent per file (see [Fan-out](#fan-out-pattern)). + +2. **Find candidate decisions.** Seed with a grep, then *read around each hit* — the grep finds candidates; judgment decides which are research-worthy per the lists above. + + ```bash + grep -nEi 'chose|switched|adopted|replaced|opted for|we use|picked|migrated|over [A-Za-z.-]+ \(|instead of' ROADMAP.md IMPL.md + ``` + + **Don't trust the grep.** Most `instead of` hits are descriptive prose, not technology choices — classify per-hit, not per-match. + +3. **Check provenance.** For each genuine decision, look for a `research(YYYY-MM):` (or `# research:`) tag on the same bullet or within a few lines. Present → grounded, skip. Absent → gap. + +4. **Report gaps.** One entry per un-grounded decision: + + ``` + ROADMAP.md:88 + decision: "embed offline maps with MapLibre + PMTiles" + gap: no research provenance — grounded in 2026 offline-map options, or recall? + action: web-search current PMTiles/MapLibre tradeoffs → add `# research(2026-05): ` + ``` + +5. **Ground or flag (on confirmation).** + - **Ground it** (the point of the skill — close the loop, don't just flag): web-search the decision's current best practice, then add `# research(YYYY-MM): ` capturing what you actually found. + - **Flag only:** if grounding needs the user's domain context a quick search can't supply, leave a one-line note to ground it *before* it hardens into code. + - **Never invent a citation.** A `research()` tag with a fabricated or unread source is worse than no tag. + +## Fan-out pattern + +For a directory (e.g. `planning/`), spawn one `Explore` subagent per file; each returns a gap report for its file. Consolidate, present grouped by file, then ask `ground all / ground selected / skip?`. + +## Don't touch + +- Docs the user didn't name or imply; git history / changelogs (historical, not live bets). +- `research()` tags that already exist — don't reword a grounded decision unprompted. +- Decisions already shipped as code with no alternative left to research — audit the plan, not the past. + +## Why this skill is quiet + +Output is the gap report and, on confirmation, the web-searched `research()` tags. No narration of the scan — just `decision → gap → action`. Sibling of `/techne:docsync` (claims vs code); this is decisions vs evidence. diff --git a/zensical.toml b/zensical.toml index 7cb23cd..fa7e89a 100644 --- a/zensical.toml +++ b/zensical.toml @@ -26,6 +26,7 @@ nav = [ { "deslop" = "skills/deslop.md" }, { "docs-site" = "skills/docs-site.md" }, { "docsync" = "skills/docsync.md" }, + { "research-grounded" = "skills/research-grounded.md" }, { "reslop" = "skills/reslop.md" }, { "sisters" = "skills/sisters.md" }, { "theoros" = "skills/theoros.md" },