diff --git a/.claude/agents/claude-code-consultant.md b/.claude/agents/claude-code-consultant.md deleted file mode 100644 index 546b45f..0000000 --- a/.claude/agents/claude-code-consultant.md +++ /dev/null @@ -1,202 +0,0 @@ ---- -name: claude-code-consultant -description: > - Meta / System Steward for Claude Code repository setups. Use when you want to audit, - debug, or improve the Claude Code config/memory layer of a repo: .claude/**, CLAUDE.md, - CLAUDE.local.md, .mcp.json, and .claude/settings*.json. Use proactively when behavior - is inconsistent across machines, permissions feel unsafe, rules/agents conflict, - hooks/MCP expand risk, or context is bloated. Always consult first and only implement - changes after the user explicitly approves the proposed actions. -tools: - - Read - - Glob - - Grep - - LS - - Edit - - Write - - Bash -model: inherit -permissionMode: default ---- - -# Claude Code Consultant (Subagent) - -You are **Claude Code Consultant** — a senior technical consultant and system designer for **Claude Code's steering layer**. - -Your mission: -- Understand the **user's goal** for how Claude Code should behave in this repository. -- Understand the **current steering setup** (artifacts + interactions). -- Evaluate whether the current setup is using Claude Code's capabilities effectively (clarity, safety, least privilege, maintainability, performance/context efficiency). -- Propose concrete improvements. -- **Only implement** improvements after the user explicitly approves. - -You are **not** a domain feature implementer. Default scope is the Claude Code configuration & memory layer. - ---- - -## Scope boundaries (default) - -You may analyze and (when approved) edit **only**: -- `.claude/**` -- `CLAUDE.md` -- `CLAUDE.local.md` -- `.mcp.json` -- `.claude/settings*.json` - -Do **not** refactor application source code outside this scope unless the user explicitly asks. 
- ---- - -## Non-negotiable working style - -### Consult-first always -On initial invocation (and whenever implementation is not explicitly approved): -- **Do not change any files.** -- Read and analyze the steering layer. -- Ask only the minimum necessary clarifying questions (goal, constraints) if needed. -- Explain what you found, why it matters, and what options exist. - -### Implement only after explicit approval -You may only edit files after the user clearly approves the proposed actions, e.g.: -- "Yes, implement these changes." -- "Apply the proposal." -- "Do items 1 and 3." -- "Proceed with your recommended edits." - -If approval is ambiguous, **do not edit**—ask for a clear go-ahead. - -### Minimal diffs, maximum clarity -Prefer the smallest effective change: -- targeted edits over rewrites -- additive files over reshuffles -- avoid churn unless payoff is clear - -### Least privilege mindset -Prefer safer postures: -- read-only analysis first -- narrow tool permissions -- avoid recommending `bypassPermissions` unless explicitly requested and well-sandboxed - -### Version drift awareness -If something is version-sensitive or uncertain: -- say so, -- consult the repo Knowledgebook (below) or suggest verifying via docs/MCP, -- then adjust recommendations. - ---- - -## Knowledge loading protocol (how you stay "expert") - -You have two knowledge sources: - -### A) Built-in cheat sheet (below) -Use for quick, common decisions. - -### B) Repo Knowledgebook (deep reference) -When you need exact schema fields, edge cases, precedence subtleties, hook I/O semantics, or you're about to propose structural changes: -- **Read**: `.claude/docs/claude-code-steering-guide.md` (detailed Claude Code reference) -- **Read**: `.claude/docs/system-overview.md` (this project's meta-system architecture) -- Treat these as the **source of truth** for this repository's Claude Code conventions and meta-system design. 
- -If the Knowledgebook is missing, recommend adding it (but do not create files unless the user approves implementation). - -Never invent undocumented YAML tags. If unsure, consult the Knowledgebook or ask the user to provide the relevant snippet. - ---- - -## Built-in Claude Code cheat sheet (high-signal essentials) - -### Artifact families and canonical paths -- Memory: `CLAUDE.md`, `CLAUDE.local.md`, `.claude/CLAUDE.md`, `~/.claude/CLAUDE.md` -- Rules: `.claude/rules/**/*.md`, `~/.claude/rules/**/*.md` (frontmatter supports only `paths`) -- Settings (JSON): managed policy, `~/.claude/settings.json` (user), `.claude/settings.json` (project), `.claude/settings.local.json` (local) -- Subagents: `.claude/agents/*.md`, `~/.claude/agents/*.md` (+ CLI `--agents`, plugins) -- Skills: `.claude/skills//SKILL.md`, `~/.claude/skills//SKILL.md` (+ managed/plugins) -- Commands: `.claude/commands/*.md`, `~/.claude/commands/*.md` -- Hooks: in settings `"hooks"` + component `hooks:` frontmatter (agents/skills/commands) -- MCP: `.mcp.json` (project), `~/.claude.json` (user/global) - -### Precedence (practical) -- Settings: Managed > CLI > Local project > Project > User -- Permissions: deny > ask > allow -- Memory load order (treat as additive, avoid conflicts): Enterprise > Project memory > Project rules > User memory > Local project memory -- Rules: user rules load before project rules; project overrides on conflict -- Subagents: CLI > Project > User > Plugins -- Skills: Managed > Personal > Project > Plugins - -### Rules schema -- `.claude/rules/*.md` frontmatter: `paths: [ "", ...
]` only -- `paths` are repo-relative globs; support `**` and brace expansion -- no `paths` => global rule - -### Subagent schema (frontmatter fields) -- required: `name`, `description` -- optional: `tools` (allowlist; omit = inherit), `disallowedTools`, `model`, `permissionMode`, `skills`, `hooks` -- permissionMode: `default|acceptEdits|dontAsk|bypassPermissions|plan` - -### Skills schema (frontmatter highlights) -- required: `name`, `description` -- optional: `allowed-tools`, `model`, `context: fork`, `agent`, `hooks`, `user-invocable`, `disable-model-invocation` -- progressive disclosure: only name+description indexed at startup; body loads on activation - -### Commands schema (frontmatter highlights) -- optional: `description`, `allowed-tools`, `model`, `argument-hint`, `hooks` -- body supports: `!` backticked bash, `@file` references, `$ARGUMENTS` / `$1` / `$2` - -### Hooks highlights -- settings hooks have many events; component hooks support `PreToolUse|PostToolUse|Stop` -- hooks can block actions (exit code 2 or structured decision JSON) -- keep matchers narrow; avoid noisy context injection - ---- - -## How to run an engagement (internal guidance) - -### 1) Establish the goal (quickly) -If unclear, ask one short question such as: -- "What do you want Claude Code to optimize for here: safety, autonomy/speed, consistency, or context efficiency?" -- "Any non-negotiables (security policy, approval gates, tool bans, team workflows)?" - -### 2) Map the current system (within scope) -Use `LS`/`Glob` to enumerate: -- `.claude/settings*.json` -- `.claude/rules/**/*.md` -- `.claude/agents/*.md` -- `.claude/skills/**/SKILL.md` -- `.claude/commands/*.md` -- `.claude/meta/claude-code-knowledgebook.md` (if present) -- root `CLAUDE.md`, `CLAUDE.local.md`, `.mcp.json` - -### 3) Evaluate against best practices -Assess: clarity, conflicts, least privilege, maintainability, and context efficiency. 
- -### 4) Consult with actionable options -Explain: -- what exists and how it behaves (based on precedence & schema) -- what's working vs. what's risky/confusing -- recommended improvements with rationale and tradeoffs - -### 5) Ask for approval to implement -Provide a concise "If you want me to implement, say: 'Yes, implement the proposal' (or specify which items)." -Only after that, proceed to edit files. - ---- - -## Audit checklist (use internally; adapt output) - -- **Memory & rules**: contradictions, bloat, missing path scoping, import sprawl -- **Settings & permissions**: defaultMode alignment, deny rules for secrets, overly broad bash, sandbox posture -- **Subagents**: clear delegation triggers, least privilege tools, permissionMode appropriate, collisions -- **Skills**: progressive disclosure, `allowed-tools` constraints, names/descriptions precise -- **Commands**: safe `!` bash, constrained tools, minimal `@` context injection, clear arguments -- **Hooks**: narrow matchers, deterministic scripts, clear blocking reasons, low noise -- **MCP**: minimal and vetted servers, reduced leak surface, consistent across machines - ---- - -## Safety posture -- Assume secrets exist; prefer deny rules for `.env*`, `secrets/**`, credentials paths if absent. -- Treat hooks and MCP as high-leverage + higher-risk; keep them minimal and well explained. -- Never recommend putting credentials in repo memory files. - ---- diff --git a/.claude/docs/claude-code-steering-guide.md b/.claude/docs/claude-code-steering-guide.md deleted file mode 100644 index 5a79c6a..0000000 --- a/.claude/docs/claude-code-steering-guide.md +++ /dev/null @@ -1,931 +0,0 @@ -## 1) The mental model: 3 steering layers - -Claude Code behavior is shaped by three complementary mechanisms: - -1. **Instruction text** (persistent or conditional) - - -- `CLAUDE*.md` memory and `.claude/rules/*.md` rules become _system-level instruction text_ for the session. - - Claude Code Artifacts Guide - - -2. 
**Declarative policy/config** - - -- `settings.json` controls permissions, sandboxing, hooks wiring, plugins, model defaults, etc. This is the deterministic enforcement layer (block/allow/ask). - - Claude Code Repo Artifacts - Co… - - -3. **Executable policy / dynamic context** - - -- **Hooks** run scripts on events (before/after tools, session start, prompt submit, stop, etc.), can **block actions**, and can inject **additionalContext** dynamically. - - Claude Code Repo Artifacts - Co… - - -A practical takeaway: **Use instruction text for “how to do work”, settings for “what is allowed”, hooks for “guardrails + automation + dynamic context”.** - -Claude Code Repo Artifacts - Co… - ---- - -## 2) Lifecycle: from disk → “what the model sees” - -On session start, Claude Code roughly does: - -1. Merge settings by precedence (managed → CLI → local → project → user). - - Claude Code Repo Artifacts - Co… - -2. Load memory & rules: - - - `CLAUDE*.md` from relevant scopes, plus `.claude/rules/*.md` and `~/.claude/rules/*.md` (path-scoped rules activate only when relevant files are in play). - - Claude Code Repo Artifacts - Co… - -3. Discover subagents (`.claude/agents`, `~/.claude/agents`, plugins, CLI `--agents`). - - Claude Code Repo Artifacts - Co… - -4. Discover Skills (initially **only name + description**, bodies lazy-loaded). - - Claude Code Repo Artifacts - Co… - -5. Discover slash commands (`.claude/commands`, `~/.claude/commands`, plugin commands). - - Claude Code Repo Artifacts - Co… - -6. Register hooks from settings + component frontmatter + plugins. - - Claude Code Repo Artifacts - Co… - -7. Apply permissions & permission mode (IAM) rules (allow/ask/deny, sandboxing, etc.). 
- - Claude Code Repo Artifacts - Co… - - -For each model invocation, “context” is assembled from: - -- Internal system prompt (not published / not editable) - - Claude Code Artifacts Guide - -- Memory text + applicable rules - - Claude Code Artifacts Guide - -- Any active Skill bodies or subagent prompts - - Claude Code Artifacts Guide - -- Conversation history (possibly compacted) - - Claude Code Artifacts Guide - -- Selected file contents and tool outputs, plus hook-injected `additionalContext` - - Claude Code Repo Artifacts - Co… - - ---- - -## 3) Memory & rules: always-on vs conditional instruction text - -### 3.1 What counts as “memory” - -Memory is plain Markdown instruction text that becomes system-level guidance: - -**Scopes & typical locations** (conceptual precedence: org → project → local → user): - -Claude Code Artifacts Guide - -- Org/managed policies (if your org uses managed configuration). - -- Project memory: `CLAUDE.md` committed to repo (often repo root; `.claude/CLAUDE.md` is also supported). - - Claude Code Artifacts Guide - -- Project local memory: `CLAUDE.local.md` (gitignored; often root or `.claude/`). - - Claude Code Artifacts Guide - -- User global memory: `~/.claude/CLAUDE.md` (applies to all projects). - - Claude Code Artifacts Guide - - -**Discovery behavior (important):** - -- Starting from your current working directory, Claude Code searches upward for `CLAUDE.md` / `CLAUDE.local.md` up to repo root, and loads all it finds on that upward path. - - Claude Code Artifacts Guide - -- It can also _detect_ deeper child-directory `CLAUDE.md` files but will **lazy-load** them only when you start working in those subtrees. - - Claude Code Artifacts Guide - - Claude Code Repo Artifacts - Co… - - -**Conflict handling:** there’s no formal override system beyond “the model reads text”; avoid contradictory instructions. More specific/project text tends to win in practice because of how it’s injected later in the composite prompt. 
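The upward search described under "Discovery behavior" can be sketched roughly as follows. This is an illustration only, not Claude Code's actual loader: the function name `find_memory_files`, the explicit `repo_root` argument, and the nearest-first return order are all assumptions made for the example.

```python
from pathlib import Path

# File names taken from the guide; everything else here is hypothetical.
MEMORY_NAMES = ("CLAUDE.md", "CLAUDE.local.md")


def find_memory_files(cwd: Path, repo_root: Path) -> list[Path]:
    """Collect memory files on the path from cwd up to repo_root.

    Results are ordered nearest directory first; the real load order
    is not specified in the guide, so this ordering is an assumption.
    """
    cwd = cwd.resolve()
    repo_root = repo_root.resolve()
    if cwd != repo_root and repo_root not in cwd.parents:
        raise ValueError("cwd must be inside repo_root")

    found: list[Path] = []
    directory = cwd
    while True:
        for name in MEMORY_NAMES:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
        if directory == repo_root:
            break
        directory = directory.parent
    return found
```

Sibling directories are never visited, which mirrors the guide's point that memory files in other subtrees are only loaded lazily once work moves there.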
- -Claude Code Artifacts Guide - -### 3.2 Rules files: `.claude/rules/*.md` (+ user rules) - -Rules are also Markdown instructions, but can be **path-scoped**. - -- Project rules: `.claude/rules/**/*.md` (recursive). - - Claude Code Artifacts Guide - -- User rules: `~/.claude/rules/**/*.md` (recursive), loaded before project rules (so project can override/extend). - - Claude Code Artifacts Guide - -- Rules are “same priority” as project memory; they’re mainly for organization + conditional scoping. - - Claude Code Artifacts Guide - - -**Frontmatter schema for rules (`paths` only):** - -Claude Code Repo Artifacts - Co… - -- `paths` is a list of glob patterns; rule applies only when Claude is working with matching files. - -- Globs support `**`, `*`, and brace expansion: - - - `**/*.ts`, `src/**/*`, `src/**/*.{ts,tsx}`, `{src,lib}/**/*.ts` - - Claude Code Artifacts Guide - -- No `paths` → rule is global/always applicable. - - Claude Code Artifacts Guide - - -**Optimization detail:** path-scoped rules may be excluded from immediate context until relevant (exact filtering not fully documented), which helps reduce context bloat. - -Claude Code Artifacts Guide - -### 3.3 `@` imports inside memory/rules (static include) - -Inside `CLAUDE*.md` or rules you can include other files via `@path`. This is a _preprocessing include_ that inserts the referenced file content into memory context. - -Claude Code Artifacts Guide - -Key mechanics: - -Claude Code Artifacts Guide - -- Relative + absolute paths supported; can import from home (e.g. `@~/.claude/...`). - -- Imports can be nested, but **max depth is 5** (prevents recursion). - -- Imports inside inline code / code blocks are ignored (so `` `@file` `` won’t import). - - -**Practical guidance:** keep `CLAUDE.md` concise and import only what must be “always on”. Put large references in separate files and link/import strategically. 
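The import mechanics listed above (nested includes, a depth limit of 5, and no expansion inside inline code or code blocks) can be illustrated with a small sketch. It is not the real preprocessor: `expand_imports`, the regular expressions, and the choice to leave unresolved references as literal text are assumptions made for the example.

```python
import re
from pathlib import Path

MAX_DEPTH = 5                      # depth limit stated in the guide
FENCE = "`" * 3                    # code-fence marker
IMPORT_RE = re.compile(r"(?<!\S)@(\S+)")
INLINE_CODE_RE = re.compile(r"`[^`]*`")


def expand_imports(path: Path, depth: int = 0) -> str:
    """Return the text of `path` with @path references expanded inline."""
    out: list[str] = []
    in_fence = False
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            out.append(line)
            continue
        if in_fence:
            out.append(line)       # imports inside code blocks are ignored
            continue
        # Blank out inline code so references inside it never match,
        # while keeping character positions aligned with the original.
        masked = INLINE_CODE_RE.sub(lambda m: " " * len(m.group()), line)
        pieces: list[str] = []
        last = 0
        for match in IMPORT_RE.finditer(masked):
            target = Path(match.group(1)).expanduser()
            if not target.is_absolute():
                target = path.parent / target
            pieces.append(line[last:match.start()])
            if depth < MAX_DEPTH and target.is_file():
                pieces.append(expand_imports(target, depth + 1))
            else:
                pieces.append(line[match.start():match.end()])
            last = match.end()
        pieces.append(line[last:])
        out.append("".join(pieces))
    return "\n".join(out)
```

Because expansion happens once when the file is read, the result is static text, which is why large always-on imports add to every session's context.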
- -Claude Code Repo Artifacts - Co… - ---- - -## 4) Settings (`settings.json`): deterministic control of tools, scope, safety - -`settings.json` is the declarative configuration layer for: - -- permissions / IAM defaults (allow/ask/deny, modes) - -- sandboxing for Bash - -- hooks - -- environment variables for tool runs - -- plugins / marketplaces - -- model-ish defaults, status line, file suggestions, “thinking” toggles, etc. - - Claude Code Repo Artifacts - Co… - - -### 4.1 Locations & precedence - -Canonical locations and precedence: **Managed > CLI args > Local > Project > User**. - -Claude Code Repo Artifacts - Co… - -- Managed: OS-specific paths (highest precedence; enterprise policy). - - Claude Code Repo Artifacts - Co… - -- User: `~/.claude/settings.json` - - Claude Code Repo Artifacts - Co… - -- Project: `.claude/settings.json` - - Claude Code Repo Artifacts - Co… - -- Local project: `.claude/settings.local.json` (personal override, gitignored) - - Claude Code Repo Artifacts - Co… - - -### 4.2 Permissions structure: allow / ask / deny - -Permissions rules decide what Claude can do without prompting, what always requires confirmation, and what is blocked: - -- `allow`: auto-approved tool patterns (notably, **Bash patterns are prefix matches, not regex**). - - Claude Code Repo Artifacts - Co… - -- `ask`: always require confirmation; **ask overrides allow** if both match. - - Claude Code Repo Artifacts - Co… - -- `deny`: hard blocks; **deny wins over allow/ask**. - - Claude Code Repo Artifacts - Co… - -- `additionalDirectories`: expands Claude’s accessible workspace beyond the working directory (useful in monorepos). - - Claude Code Repo Artifacts - Co… - - -**Security posture tip:** deny reads of `.env` and secret directories to prevent leaking sensitive data into model context. 
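The precedence among the three rule lists (deny over ask over allow) can be sketched as a small decision function. This is a simplified model, not Claude Code's rule engine: the `(tool, prefix)` tuple format, the `Permissions` class, and the fallback to `"ask"` when nothing matches are assumptions for illustration, and prefix matching is applied to every tool here even though the guide only states it for Bash patterns.

```python
from dataclasses import dataclass, field

Rule = tuple[str, str]  # hypothetical format: (tool name, argument prefix)


@dataclass
class Permissions:
    allow: list[Rule] = field(default_factory=list)
    ask: list[Rule] = field(default_factory=list)
    deny: list[Rule] = field(default_factory=list)


def _matches(rules: list[Rule], tool: str, argument: str) -> bool:
    # Plain prefix comparison, not a regular expression.
    return any(t == tool and argument.startswith(prefix) for t, prefix in rules)


def decide(perms: Permissions, tool: str, argument: str) -> str:
    """Return 'deny', 'ask', or 'allow' for one tool call."""
    if _matches(perms.deny, tool, argument):
        return "deny"            # deny wins over everything
    if _matches(perms.ask, tool, argument):
        return "ask"             # ask overrides allow
    if _matches(perms.allow, tool, argument):
        return "allow"
    return "ask"                 # assumed fallback; the real one depends on the mode


perms = Permissions(
    allow=[("Read", ""), ("Bash", "git ")],
    ask=[("Bash", "git push")],
    deny=[("Read", ".env")],
)
```

With these sample rules, `decide(perms, "Read", ".env.local")` is denied even though a broad `Read` allow rule exists, which is the behavior the security tip relies on.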
- -Claude Code Repo Artifacts - Co… - -### 4.3 Default permission modes (IAM) - -`defaultMode` sets the baseline permission behavior: - -Claude Code Repo Artifacts - Co… - -- `default`: prompt on first use of each tool - -- `acceptEdits`: auto-accept file edits/filesystem ops; still prompts for other tools - -- `plan`: analyze-only; no modifications or command execution - - Claude Code Repo Artifacts - Co… - -- `dontAsk`: auto-deny tools unless pre-approved (very restrictive) - - Claude Code Repo Artifacts - Co… - -- `bypassPermissions`: no prompts; only safe in strong sandbox environments - - Claude Code Repo Artifacts - Co… - - -Managed settings can disable bypass mode (`disableBypassPermissionsMode`) and block the CLI flag for skipping permissions. - -Claude Code Repo Artifacts - Co… - -### 4.4 Sandboxing (Bash execution environment) - -Settings can enforce Bash sandboxing, with options like `autoAllowBashIfSandboxed`, `excludedCommands`, and network constraints. - -Claude Code Repo Artifacts - Co… - -Important nuance: sandboxing controls **where** Bash runs, but filesystem/network restrictions still fundamentally depend on the permission system (`Read`, `Edit`, `WebFetch`, etc.). 
- -Claude Code Repo Artifacts - Co… - -### 4.5 Other notable settings keys that influence behavior/context - -Commonly relevant keys include: - -Claude Code Repo Artifacts - Co… - -- `model` (default model) - -- `statusLine` (can show context usage) - -- `fileSuggestion` (customizes `@`-autocomplete discovery) - -- `alwaysThinkingEnabled` / `MAX_THINKING_TOKENS` - -- `enabledPlugins`, `extraKnownMarketplaces`, `strictKnownMarketplaces` (plugin sources + enablement) - - Claude Code Artifacts Guide - -- `env` (env vars applied to Bash tool executions; handle secrets carefully) - - Claude Code Artifacts Guide - -- `cleanupPeriodDays` (session cleanup) - - Claude Code Artifacts Guide - -- `companyAnnouncements`, telemetry/metrics toggles (less direct steering) - - Claude Code Artifacts Guide - - ---- - -## 5) Hooks: executable guardrails + dynamic context injection - -Hooks can: - -- run commands before/after tools - -- block tool executions - -- mutate tool input - -- inject additional context at key points - -- shape stopping/resume behavior - - -They are a major “automation and policy” surface. - -Claude Code Repo Artifacts - Co… - -### 5.1 Where hooks can be defined - -- In `settings.json` (`"hooks": { ... 
}`) - - Claude Code Artifacts Guide - -- In Skill frontmatter (active only during Skill execution) - - Claude Code Artifacts Guide - -- In subagent frontmatter (agent-scoped) - - Claude Code Repo Artifacts - Co… - -- In command frontmatter (command-scoped) - - Claude Code Repo Artifacts - Co… - -- In plugins (e.g., `hooks.json`) - - Claude Code Repo Artifacts - Co… - - -### 5.2 Blocking semantics & control via exit codes - -Exit code behavior: - -Claude Code Repo Artifacts - Co… - -- `0`: success; stdout may be consumed (and for some events, JSON in stdout can control behavior) - -- `2`: **blocking error** (action blocked; stderr becomes error message; stdout JSON ignored) - -- other non-zero: non-blocking error (primarily for logs/debug) - - -### 5.3 JSON control protocol (stdout) - -Hooks can emit structured JSON to influence Claude Code behavior: - -Claude Code Repo Artifacts - Co… - - -Common fields include: - -- `continue` (false stops further processing) - -- `stopReason` - -- `suppressOutput` - -- `systemMessage` - -- `hookSpecificOutput` (event-specific payload) - - -Event-specific examples: - -Claude Code Repo Artifacts - Co… - -- **PreToolUse**: set `permissionDecision` (`allow|deny|ask`), mutate `updatedInput`, inject `additionalContext` - -- **PermissionRequest**: allow/deny + updatedInput + message; can interrupt - -- **PostToolUse**: can block (decision) or inject context after completion - -- **UserPromptSubmit**: block prompt or inject context - -- **Stop / SubagentStop**: block stop + provide `reason` to guide continuation - -- **SessionStart**: inject initial `additionalContext` - - -### 5.4 High-value hook patterns - -- **Policy enforcement**: deny dangerous commands, enforce “no secrets”, prevent writes to protected paths. - - Claude Code Repo Artifacts - Co… - -- **Quality automation**: auto-format/lint after edits, run unit tests after writes. 
- - Claude Code Artifacts Guide - -- **Dynamic context**: inject `git diff`, open tickets, branch status at SessionStart or before tools. - - Claude Code Repo Artifacts - Co… - -- **Compaction support**: run logic before compaction using a PreCompact hook. - - Claude Code Artifacts Guide - - ---- - -## 6) Skills: auto-discovered, reusable “capabilities” with progressive disclosure - -Skills are best for **automatic** or “always available” expertise and workflows that should trigger from intent keywords. - -### 6.1 Discovery → activation → execution - -- At startup, only **name + description** for each Skill are loaded for discovery. - - Claude Code Repo Artifacts - Co… - -- When a request matches, Claude proposes using the Skill (often requiring confirmation). - - Claude Code Artifacts Guide - -- On activation, the full `SKILL.md` is loaded into context for that run, plus any on-demand linked files/scripts as needed. - - Claude Code Artifacts Guide - - -### 6.2 Progressive disclosure inside Skills - -Skills can keep their core file lean and link out to supporting docs/scripts: - -Claude Code Artifacts Guide - -- Markdown links like `[Reference](reference.md)` are discovered; linked docs are typically loaded **on demand**. - -- Default behavior is often “one hop” of link-following; keep critical content one link away. - -- Scripts (e.g., `scripts/validate_form.py`) can be executed via Bash; only output enters context (saves tokens). - -- Recommendation: keep `SKILL.md` under ~500 lines; move bulk into linked files. 
- - Claude Code Artifacts Guide - - -### 6.3 Skill schema (`SKILL.md` frontmatter) - -Officially documented fields include: - -Claude Code Repo Artifacts - Co… - -- `name` (required; lowercase letters/numbers/hyphens; ≤64 chars) - -- `description` (required; ≤1024 chars) - -- `allowed-tools` (allowlist; tools usable without asking while Skill active) - -- `model` (override model while active) - -- `context: fork` (run Skill in separate subagent context) - - Claude Code Artifacts Guide - -- `agent` (when forking, choose agent profile) - - Claude Code Artifacts Guide - -- `hooks` (skill-scoped; PreToolUse/PostToolUse/Stop; supports `once: true`) - - Claude Code Artifacts Guide - -- `user-invocable` (default true; if false, hides from slash menu but still eligible for auto-discovery) - - Claude Code Artifacts Guide - -- `disable-model-invocation` (blocks programmatic invocation via Skill tool; auto-discovery may still exist depending on environment) - - Claude Code Repo Artifacts - Co… - - -**Visibility/control matrix (important):** - -Claude Code Repo Artifacts - Co… - -- default (`user-invocable: true`): visible in slash menu; Skill tool allowed; auto-discovery yes - -- `user-invocable: false`: hidden; Skill tool allowed; auto-discovery yes - -- `disable-model-invocation: true`: visible; Skill tool blocked; auto-discovery yes (unless further restricted) - - -### 6.4 Where Skills can be discovered (including monorepos) - -- Skills exist in `.claude/skills/` (project), `~/.claude/skills/` (user), plugin packages, and can also be discovered from nested `.claude/skills/` directories in monorepos near the active file path. - - Claude Code Repo Artifacts - Co… - - ---- - -## 7) Slash commands: explicit prompt templates (+ pre-executed context) - -Slash commands are best for “do this exact flow now” — one-shot or repeatable user-triggered procedures. 
- -### 7.1 Locations & scope - -- Project: `.claude/commands/*.md` - - Claude Code Repo Artifacts - Co… - -- User: `~/.claude/commands/*.md` - - Claude Code Repo Artifacts - Co… - -- Plugin commands: namespaced like `/pluginName:commandName`. - - Claude Code Artifacts Guide - - -Name collisions between user and project commands are not cleanly specified; avoid duplicates. - -Claude Code Artifacts Guide - -### 7.2 Command file schema (documented) - -From the Agent SDK slash-commands doc, frontmatter fields include: - -Claude Code Repo Artifacts - Co… - -- `description` (recommended; used in help and for metadata exposure) - -- `allowed-tools` (temporary allowlist while running the command) - -- `model` (model override for this command) - -- `argument-hint` (help text for args) - -- `hooks` (command-scoped hooks; same schema; supports `once: true`) - - -### 7.3 Body syntax (power features) - -Slash command bodies can include: - -Claude Code Repo Artifacts - Co… - -- **Bash pre-execution:** `!` + backticked shell command, e.g. `!`git diff`` - - - Claude Code executes these _before_ prompting the model and inserts outputs into context. - -- **File references:** `@path/to/file` to inject file contents into context for that command run. - - Claude Code Repo Artifacts - Co… - -- **Arguments templating:** `$ARGUMENTS` or `$1`, `$2`, … substituted from `/command arg1 arg2`. - - Claude Code Artifacts Guide - - - If you don’t use placeholders, args are appended as `ARGUMENTS: ...` by default. - - Claude Code Artifacts Guide - - -### 7.4 Model-initiated invocation + metadata budget - -- Commands (and Skills) are surfaced to the model via an internal “tool listing” mechanism; there’s a **~15k character budget** for names/descriptions/metadata. If exceeded, only a subset is included; `/context` warns. - - Claude Code Artifacts Guide - -- You can adjust the budget via `SLASH_COMMAND_TOOL_CHAR_BUDGET`. 
- - Claude Code Artifacts Guide - - -### 7.5 Restricting whether the model can invoke commands - -There are two relevant control planes: - -1. **Deterministic enforcement via permissions** - - -- You can globally deny the Skill-like invocation mechanism by denying the `Skill` pseudo-tool in settings (prevents model-initiated invocation of commands/skills through that mechanism). - - Claude Code Artifacts Guide - - -2. **Per-command metadata suppression** - - -- Some environments support `disable-model-invocation: true` for commands (intended to keep it user-only by removing it from model-visible listings). - - Claude Code Artifacts Guide - -- However, official SDK documentation does **not** list `disable-model-invocation` for commands (it _is_ a Skill field). Treat command support for this as **environment/version-dependent** and verify in your build. - - Claude Code Repo Artifacts - Co… - - -### 7.6 Forked execution for commands (verify in your build) - -Some guides describe `context: fork` (and `agent`) for commands to run them in an isolated subagent context (like Skills). - -Claude Code Artifacts Guide - - -Official slash-command docs emphasize the core schema above and do not clearly standardize `context/agent` for commands; treat this as **not guaranteed** unless you’ve confirmed your Claude Code version supports it. - -Claude Code Repo Artifacts - Co… - ---- - -## 8) Subagents: specialized Claude instances with separate context + constraints - -Subagents are for compartmentalized work (exploration, reviews, risky tasks with stricter permissions, etc.). - -### 8.1 Purpose & key properties - -- Each subagent has its own system prompt, tool set, permission mode, optional preloaded Skills, and a **separate context window** with its own compaction lifecycle. - - Claude Code Repo Artifacts - Co… - -- Subagents help keep the main conversation clean: only the result/summary returns to the main context, not the entire subagent transcript. 
- - Claude Code Artifacts Guide - -- Subagents cannot spawn other subagents (no nesting). - - Claude Code Artifacts Guide - - -### 8.2 Locations & discovery precedence - -Precedence: CLI-defined > project > user > plugin. - -Claude Code Repo Artifacts - Co… - -- `.claude/agents/*.md` (project) - - Claude Code Artifacts Guide - -- `~/.claude/agents/*.md` (user) - - Claude Code Artifacts Guide - -- CLI `--agents '{...}'` (session only) - - Claude Code Repo Artifacts - Co… - -- Plugins can package agents. - - Claude Code Repo Artifacts - Co… - - -Agents are listed via `/agents`; edits may require restart or reload. - -Claude Code Repo Artifacts - Co… - -### 8.3 Subagent file schema (YAML frontmatter) - -Required: `name`, `description`. - -Claude Code Artifacts Guide - - -Common fields: - -Claude Code Repo Artifacts - Co… - -- `tools` (allowlist; if omitted, inherits all tools from main) - -- `disallowedTools` (denylist applied atop inherited/specified tools) - -- `model` (`sonnet`, `opus`, `haiku`, `inherit`) - -- `permissionMode` (`default`, `acceptEdits`, `dontAsk`, `bypassPermissions`, `plan`) - -- `skills` (Skills to inject into subagent at startup; subagents don’t inherit Skills by default) - -- `hooks` (agent-scoped hooks; supports PreToolUse/PostToolUse/Stop) - - -The markdown body after frontmatter is the subagent’s system prompt; internal main prompt isn’t shared to subagents. - -Claude Code Repo Artifacts - Co… - -### 8.4 Context management, compaction, resuming - -- Subagents compact independently from the main conversation; compaction events are logged in transcripts. - - Claude Code Artifacts Guide - -- You can resume a prior subagent run; Claude Code tracks agent IDs and stores logs under something like `~/.claude/projects/{project}/{sessionId}/subagents/agent-.jsonl`. 
- - Claude Code Artifacts Guide - -- Subagent state can persist across session restarts if you reopen the same session; old sessions may be cleaned after a retention period (e.g., controlled by `cleanupPeriodDays`). - - Claude Code Artifacts Guide - - ---- - -## 9) File referencing: three different mechanisms (don’t mix them up) - -### 9.1 `@` in memory/rules = static include (preprocessing) - -- Happens when memory file is loaded; literally inserts file text into memory context at that point. Not “dynamic per prompt”. - - Claude Code Artifacts Guide - -- Depth limit 5; ignores code blocks/inline code. - - Claude Code Artifacts Guide - - -### 9.2 `@` in prompts/commands = on-demand file injection - -- Typing `@` in Claude Code input triggers a file picker; selecting inserts a reference that causes Claude to read file contents into context. - - Claude Code Repo Artifacts - Co… - -- In commands, `@file` can be included directly in the body to pull file contents into that command run’s context. - - Claude Code Repo Artifacts - Co… - - -### 9.3 Skill links = progressive disclosure - -- Markdown links in `SKILL.md` advertise supporting files; Claude loads them only if needed. One-hop default is common. - - Claude Code Artifacts Guide - - -### 9.4 `!` backticks in commands = pre-executed Bash → output to context - -- The command body can run shell snippets `!`like this``; outputs become part of the prompt context. - - Claude Code Repo Artifacts - Co… - - ---- - -## 10) Context limits, compaction, and budgets - -Claude models have finite context windows; Claude Code manages this via lazy-loading and compaction. - -Claude Code Artifacts Guide - -Key behaviors: - -Claude Code Artifacts Guide - -- **Auto-compaction** summarizes older history when usage crosses a threshold; logged in transcripts. - -- `/compact` can be user-invoked (optionally with instructions about what to preserve). 
-
-- A **PreCompact hook** exists for “do something before compaction.”
-
-- Skill bodies load only when activated; subagent work stays isolated.
-
-**Metadata budget for commands/skills list:** ~15k characters (names/descriptions); can be tuned via `SLASH_COMMAND_TOOL_CHAR_BUDGET`.
-
-**Tool-search heuristic (behavioral nuance):**
-
-- There is an internal “tool search” tendency that may proactively fetch project files when context usage is low (e.g., <10% usage), but will be more conservative as context fills.
-
----
-
-## 11) Plugins & MCP: extending tool surface (and therefore context)
-
-### 11.1 Plugins
-
-- Plugins can bring their own Skills, agents, commands, and hooks; plugin commands are namespaced (e.g. `/pluginName:commandName`).
-
-- Plugin enablement and marketplaces are controlled via settings keys like `enabledPlugins`, `extraKnownMarketplaces`, `strictKnownMarketplaces`.
-
-### 11.2 MCP (Model Context Protocol)
-
-- MCP servers add external tools (doc stores, databases, APIs), expanding what Claude can do and what content can be pulled into context.
-
-- MCP state/preferences and project MCP servers are associated with `~/.claude.json` and `.mcp.json`.
-
-- MCP tools can pull in large content when invoked; some MCP-generated slash commands are dynamically discovered (and may not persist unless reconnected, depending on setup).
-
----
-
-## 12) Recommended repo layout (high-signal, scalable, low-bloat)
-
-A strong baseline pattern:
-
-`your-repo/ CLAUDE.md # short, high-signal project guidance CLAUDE.local.md # personal overrides (gitignored) .claude/ settings.json # shared project policy + hooks wiring settings.local.json # personal override (gitignored) rules/ testing.md # always-on testing rules backend.md # path-scoped backend rules security.md # safe-by-default rules commands/ review.md # git-diff driven review flow commit.md # conventional commit helper skills/ explaining-code/ SKILL.md reference.md # large doc, linked from SKILL.md scripts/ # heavy logic run via Bash agents/ safe-researcher.md # read-only exploration agent code-reviewer.md hooks/ check-style.sh validate-command.sh`
-
-Why this works:
-
-- `CLAUDE.md` stays lean; large docs are imported/linked only when needed.
-
-- Rules allow language/subsystem-specific guidance without polluting all sessions.
-
-- Settings enforce safety and prevent instruction drift.
-
-- Hooks make policies executable and consistent.
-
-- Skills provide reusable auto-capabilities; commands provide explicit repeatable flows.
-
----
-
-## 13) Practical steering recipes (copy/paste patterns)
-
-### A) “Safe-by-default, productive” permissions
-
-- Deny `.env` and secrets paths, allow common read/search tools, make Bash sandboxed.
-
-### B) Auto-quality gates
-
-- PostToolUse hook on `Edit`/`Write` runs formatter/linter; exit code 2 blocks if violations.
-
-### C) Git-diff review command
-
-- Slash command body pre-runs `git diff` via `!` backticks and feeds output to a structured review rubric.
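Recipe B above (the auto-quality gate) can be sketched as a small shell helper that a PostToolUse hook script might call. This is a minimal sketch under stated assumptions: the function name `run_quality_gate` and the choice to pass the edited file path and the checker command as arguments are illustrative and not part of Claude Code, and how the hook script obtains the edited file path is deliberately left out. Only the "exit code 2 blocks" convention comes from the recipe itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper for a PostToolUse hook script.
# The name and the argument layout are assumptions made for illustration.
#
# Usage: run_quality_gate <file> <checker> [checker-args...]
# Runs the checker with the file appended as its last argument and
# returns 2 when the checker fails, mirroring the "exit code 2 blocks"
# convention from recipe B. Returns 0 when the checker succeeds.
run_quality_gate() {
  local file="$1"
  shift
  # Route checker output to stderr so failure details reach the caller.
  if ! "$@" "$file" >&2; then
    echo "quality gate failed for: $file" >&2
    return 2
  fi
  return 0
}
```

A hook script could end with something like `run_quality_gate "$edited_file" ruff check; exit $?`, where `$edited_file` and the use of `ruff` are placeholders for whatever the project actually provides.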
-
-### D) Keep heavy analysis out of main context
-
-- Use subagents for log spelunking, repo-wide scanning; main chat receives only result summary.
-
-### E) Skills for “always follow our standards”
-
-- Put evergreen standards (review checklist, architecture heuristics) into Skills or rules, not into ad-hoc prompts.
-
----
-
-## 14) Known “gray areas” you should treat as version-dependent
-
-These are important because they affect how you design your steering stack:
-
-- **Slash command frontmatter beyond the SDK schema** (notably `disable-model-invocation`, `context`, `agent`) may exist in some environments and guides, but is not consistently specified as part of the official slash-commands frontmatter spec. The safest, most portable approach is to enforce behavior via **permissions** (deny Skill pseudo-tool / restrict tool access) and use hooks.
\ No newline at end of file
diff --git a/.claude/skills/branch/SKILL.md b/.claude/skills/branch/SKILL.md
deleted file mode 100644
index 844d64a..0000000
--- a/.claude/skills/branch/SKILL.md
+++ /dev/null
@@ -1,137 +0,0 @@
----
-name: branch
-description: Autonomous branching strategy — evaluate fit, create, continue, or escalate
-allowed-tools: Bash
----
-
-Manage branch strategy autonomously. Make routine decisions silently; only escalate to the human when genuinely uncertain.
-
-**Task context**: $ARGUMENTS
-(May contain: task description, theme, or area of work)
-
----
-
-## Phase 1: Gather Context
-
-### Step 1: Gather Branch Landscape
-
-```bash
-bash scripts/branch-context.sh
-```
-
-Parse the structured output: `cleaned-up`, `branch`, `clean`, `status`, `pushed`, `commit-count`, `commits`, `files-changed`, `wip-branches`.
-
-**If dirty working tree** (`clean: no`): STOP.
Ask the user what to do (commit or stash). Never make branching decisions with uncommitted changes. - -### Step 2: Gather Task Theme - -**Context contract** — to make a branching decision, the agent needs: (1) the branch landscape (from step 1) and (2) the task theme. The task theme is where the tiers apply. Stop at the earliest tier that fulfills the contract. - -**Tier 1 — Mechanical**: `$ARGUMENTS` provides the task theme directly. Contract complete. Proceed to Phase 2. - -**Tier 2 — Agent reasoning**: No `$ARGUMENTS` (empty or literal `"$ARGUMENTS"`). Derive the task theme from what's already available, in this order: - -1. **Branch landscape** — if WIP branches exist, their names, commit subjects, and changed files suggest the thematic area. If there's only one WIP branch, it's likely where work should continue. -2. **Conversation context** — what the human said earlier in this session may indicate what they want to work on. -3. **Task tracking** — `.agent/task-tracking.md` lists the current priority queue, which may indicate the next task's theme. - -If the agent can confidently determine the theme from these sources, the contract is complete. Proceed to Phase 2. - -**Tier 3 — Human input**: The agent has partial information but isn't confident. Present what it knows — "I see [WIP branch X] with commits about [topic], and the next task in the queue is [Y]" — and ask specifically for what's missing. Do NOT create a branch without knowing the theme. - ---- - -## Phase 2: Decide and Act - -Only proceed once the contract is complete — branch landscape + task theme are both known. - -### Step 3: On Main - -If `status: on-main`: - -#### No WIP branches, has task theme - -Create a thematic branch silently. - -1. Update main first: - ```bash - git pull origin main - ``` -2. 
Choose a branch name: - - **Broad enough** to accommodate related follow-up tasks — name the chapter, not one task - - Prefixes: `feat/`, `fix/`, `docs/`, `chore/` - - Kebab-case, descriptive of the thematic area - - Good: `feat/ai-steering-enhancements`, `fix/schema-validation-edge-cases` - - Bad: `feat/cra-create-analyze` (too narrow, one task ID) -3. Create: - ```bash - git checkout -b - ``` -4. Report: "Created branch ``" - -#### One WIP branch fits task theme - -The WIP branch's name, commits, and changed files align with the task theme. Switch to it silently: - -```bash -git checkout -``` - -Report: "Switched to `` — task fits the existing work." - -#### One WIP branch doesn't fit task theme - -The WIP branch exists but covers a different area. Escalate: - -"There's an existing branch `` with [N] commits about [topic]. The new task is about [new theme]. I recommend running `/pr` to close that branch first, then starting fresh on main. New branches should always start from the latest main." - -#### Multiple WIP branches - -Escalate — present the landscape and let the human decide: - -"There are multiple WIP branches: -- ``: [N] commits about [topic] -- ``: [N] commits about [topic] - -Which branch should we continue on, or should we `/pr` one or more of them first?" - -### Step 4: On Feature Branch - -If `status: on-feature`: - -Compare the task theme against the branch's theme (name + existing commits + files changed). - -**Clearly fits** — the task is in the same thematic area: -→ Continue silently. Report: "Continuing on `` — task fits the current theme." - -**Branch name is too narrow** — the task fits thematically but the branch name is task-specific rather than thematic: -→ Check if branch is pushed to remote. -- **Not pushed**: Rename automatically: - ```bash - git branch -m - ``` - Report: "Renamed branch to `` to better reflect the scope." -- **Already pushed**: Keep current name. 
Report that the name is narrower than ideal but renaming a pushed branch is disruptive. - -**Clearly doesn't fit** — different topic, unrelated area, or branch already has 5+ commits: -→ Escalate to human with specific guidance: -"This branch covers [current theme — based on name + commits]. The new task is about [new task theme]. I recommend running `/pr` to close this branch first, then starting fresh on main." - -**Genuinely ambiguous** — could go either way: -→ Ask the human. Present what's on the branch and what the new task is. Let them decide. - ---- - -## Decision Summary - -| Situation | Action | -|-----------|--------| -| On main, no WIP branches, has theme | Create branch silently | -| On main, one WIP branch fits theme | Switch to it silently | -| On main, one WIP branch doesn't fit | Escalate: recommend `/pr` first | -| On main, multiple WIP branches | Escalate: present landscape, let human decide | -| On feature, task fits | Continue silently | -| On feature, name too narrow (unpushed) | Auto-rename, continue | -| On feature, name too narrow (pushed) | Note it, continue | -| On feature, doesn't fit | Recommend `/pr` first | -| On feature, ambiguous | Ask the human | diff --git a/.claude/skills/close/SKILL.md b/.claude/skills/close/SKILL.md deleted file mode 100644 index 66d0eac..0000000 --- a/.claude/skills/close/SKILL.md +++ /dev/null @@ -1,310 +0,0 @@ ---- -name: close -description: Finalize an atomic step — distill context, commit, and (on final step) update tracking and suggest next -allowed-tools: Read, Bash, Edit, Write, Glob, Skill ---- - -Finalize a completed atomic step. May be called **multiple times per session** — once per step. On the final step, also updates tracking and suggests next steps. - -**Universal session close capability**: Works for task workflow sessions, exploration sessions, mid-task pauses, and handles idempotency. Detects session state and takes appropriate action. 
- -**Context**: $ARGUMENTS -(May contain: summary of what was implemented and why, passed from the conversational workflow) -(May also contain mode hints: "final", "step", "pause" to guide session state detection) - -## 0. Detect Session State - -Before proceeding, determine what kind of session this is by checking **observable artifacts** (files, git state) rather than conversation history. Skills don't have access to prior conversation context, so we rely on external state. - -### Check 1: Is the working tree clean? - -```bash -git status --porcelain -``` - -**If empty output** (working tree clean): -→ **Likely already closed** - Proceed to Step 0e to confirm - -**If has output** (changes exist): -→ Continue to Check 2 - -### Check 2: Is there task context? - -Check for task workflow artifacts: - -```bash -# Check current branch first -git branch --show-current -``` - -**If branch is main**: -→ Not actively working on a task (either task complete or exploration) → Continue to Check 3 - -**If branch is NOT main** (feature/task branch): -→ Check for task tracking: - -```bash -test -f .agent/task-tracking.md && grep -A 5 "Priority Queue" .agent/task-tracking.md -``` - -**If task-tracking exists**: -→ This is likely a **task workflow session** → Continue to Step 1 (normal flow) - -**If no task-tracking**: -→ Continue to Check 3 - -### Check 3: Is there uncommitted work? - -From Check 1, we know if there are changes. Now determine context: - -**If changes exist but no task context found**: -→ **Route to: Dirty Tree (Step 0d)** - Changes without clear task context - -**If no changes and no task context**: -→ **Route to: Exploration Session (Step 0c)** - Session without committed work - ---- - -## 0c. Exploration Session (No Task Context) - -This session wasn't working on a tracked task from `.agent/task-tracking.md`. This is an exploration, research, or question-answering session. - -Ask user: "This session wasn't working on a tracked task. 
Would you like to save any notes or findings from this session?" - -**If yes**: -- Ask: "Where should I save these notes?" - 1. New file in `.agent/input/session-notes-YYYY-MM-DD.md` - 2. Append to an existing backlog file (ask which one) - 3. Just remember for next session (I'll mention it) -- Write notes as requested, include date and brief session summary - -**If no**: -- Confirm: "No notes saved. Session closed cleanly. You can start a new session anytime." - -**Check working tree**: If there are uncommitted changes, offer: -"You have uncommitted changes. Would you like to commit them before closing? I'll use Tier 3 escalation to gather context." - -Exit skill after handling. No tracking update needed. - ---- - -## 0d. Dirty Tree with No Clear Context - -You have staged or unstaged changes, but no clear task context from TodoWrite or task-tracking. - -Present options to user: - -**Options:** -1. **Commit now** → I'll escalate to Tier 3 (ask for commit message context) and create a commit -2. **Stash for later** → `git stash push -m "Session paused YYYY-MM-DD"` -3. **Continue working** → Don't close yet, keep this session open -4. **Discard changes** → `git restore .` (warning: this is destructive) - -Wait for user choice, then execute: - -**Choice 1 (Commit)**: -- Use AskUserQuestion tool: "Please provide context for this commit: What was the goal? What approach was taken? What was tested?" -- Once user provides context, invoke: `Skill(skill="commit", args="")` -- Confirm commit created, session closed - -**Choice 2 (Stash)**: -```bash -git stash push -m "Session paused $(date +%Y-%m-%d)" -``` -- Confirm: "Changes stashed. Session closed. Run `git stash pop` to resume." - -**Choice 3 (Continue)**: -- Confirm: "Session still open. Call `/close` again when ready to close." -- Exit skill, don't close session - -**Choice 4 (Discard)**: -```bash -git restore . -git clean -fd -``` -- Confirm: "Changes discarded. Session closed." - -Exit skill after handling. 
No tracking update needed. - ---- - -## 0e. Likely Already Closed (Clean Working Tree) - -The working tree is clean (no uncommitted changes). This suggests either: -1. The session was already closed with `/close` -2. No work was done this session -3. All work was committed manually - -**Check branch and task context**: -```bash -git branch --show-current -test -f .agent/task-tracking.md && echo "Task tracking exists" || echo "No task tracking" -``` - -**If on main branch with clean tree**: -- Either task was completed and merged/PR'd, or no task was active -- Confirm: "Working tree is clean and on main branch. Session appears complete. Safe to start a new session." -- Exit skill - -**If on feature branch with task tracking**: -- This might be mid-task (just committed a step) -- Or task is complete -- Ask user: "Working tree is clean. Is this task complete, or do you want to continue working?" - - **Complete**: Continue to Step 1 (final close flow) - - **Continue**: Exit with "Session remains open. Call `/close` when ready." - -**If no task tracking**: -- Confirm: "Session appears complete. Working tree is clean, no task in progress. Safe to start a new session." -- Exit skill - ---- - -## 1. Identify the Task - -```bash -git branch --show-current -``` - -Read `.agent/task-tracking.md` to identify which task is being worked on (match by branch theme or current priority). -Read the backlog file for the task title and ID. - -## 2. Distill Context - -Before committing, review what happened during **this step** and construct a summary. This is a reasoning step — derive the summary from the session's actual work, not generic filler. Each step gets its own context; don't bundle reasoning from earlier steps. - -The summary must cover: -- **Goal**: What was this step trying to achieve? -- **Approach and reasoning**: What approach was taken and why? -- **Alternatives considered**: What was rejected and why? -- **What was tested**: Which scenarios were verified? 
-- **Risks**: What could go wrong or what's left unaddressed? - -This doesn't need to be exhaustive — a few sentences covering the key points. But it must be *real*, derived from the actual session. - -**Example** — what a good summary looks like: -``` -Added LRU caching to KnowledgeLoader because loading evaluation criteria -from disk on every preflight call was adding ~200ms. Chose LRU over TTL -because the data files are static within a session. Tested with empty cache, -full cache, and cache invalidation on file change. Risk: cache is not -invalidated if data files are edited mid-session, but this is acceptable -since data files only change between releases. -``` - -Use `$ARGUMENTS` if provided — it may already contain this context. If `$ARGUMENTS` is empty or thin, derive the summary from `git diff --staged` or `git diff main..HEAD` and the session conversation. - -## 3. Verify Quality (Optional but Recommended) - -If you modified implementation code (not just docs/tests), consider running relevant tests before committing. This catches test failures earlier than the `/pr` quality gate. - -**When to run tests**: -- ✅ Modified implementation in `adr_kit/` -- ✅ Refactored existing functionality -- ❌ Changed only documentation or comments -- ❌ Changed only test files (tests will run in quality suite) - -**How to run tests**: -```bash -# Run tests for specific module -pytest tests/unit/test_foo.py -v - -# Or run full unit test suite (fast) -pytest tests/unit/ -x --tb=line -``` - -**Why this matters**: Catching test failures here is faster than at `/pr` time. The `/pr` quality gate will catch failures eventually, but fixing them earlier means cleaner commit history. - -## 4. Commit - -Invoke: `Skill(skill="commit", args="")` - -Pass the full distilled summary as `$ARGUMENTS` so `/commit` has real reasoning for the commit message body. - -## 5. Intermediate, Final, or Mid-Task Pause? - -At this point, a commit has been created. 
Now determine if the task is complete or if there's more work. - -### Check for mode hints in $ARGUMENTS - -If `$ARGUMENTS` contains: -- `"final"` → Explicitly marked as final step → Proceed to Step 6 -- `"step"` or `"intermediate"` → Explicitly intermediate → Return to workflow (see below) -- `"pause"` → Explicitly pausing → Write handover note, skip to Step 7 - -### If no explicit hint, infer from state - -Read the backlog file for this task (identified in Step 1): - -```bash -# Check task status in backlog -grep "^Status:" .agent/backlog/.md -``` - -**If Status is `DONE`**: -→ Task was already marked complete, treat as final → Proceed to Step 6 - -**If Status is `IN PROGRESS`** or other: -→ Ask user: "Is this the final step for this task, or are there more steps?" - - **Final**: Proceed to Step 6 - - **Intermediate**: Return to workflow (see below) - - **Pause**: Write handover note, skip to Step 7 - -### Route based on determination - -**Final step** (all steps in the plan are complete, or this is the only step): -→ Proceed to Step 6 (update tracking, archive) - -**Intermediate step** (more steps remain, user wants to continue): -→ Confirm: "Step committed. Continuing work in this session." -→ Return control to conversational workflow -→ Do NOT update tracking, do NOT archive, do NOT suggest next steps - -**Mid-task pause** (stopping now, will resume later): -→ Write a [handover note](#handover-notes) into the backlog file -→ Task remains in `backlog/` with status `IN PROGRESS` -→ Proceed to Step 7 (suggest next step) **without archiving** - -## 6. Update Tracking (Final Step Only) - -**Backlog file**: Set `Status: DONE` and `Completed: ` -**Move** backlog file to `archive/` (e.g. 
`backlog/CRA-*.md` → `archive/CRA-*.md`) -**task-tracking.md**: - - Remove the row from the Priority Queue table - - Add the task's ID + ✅ to the **Baseline** summary line in the header - - Remove this task's ID from "Depends On" column of any tasks that depended on it - - Update test count in header if tests were added -**CHANGELOG.md** (source of truth for what changed): - - Add user-facing changes to the `[Unreleased]` section under the appropriate heading (Added/Changed/Fixed/Removed) - - Write from the user's perspective — what the feature does, not implementation details - - Skip purely internal changes (dev tooling, .agent/ updates, workflow tweaks) unless they affect the installed package - -## 7. Smart Next-Step Suggestion (Final Step or Session Ending) - -After updating tracking (or after writing a handover note), read the priority queue to see what task comes next. - -Run branch-context to understand the current state: -```bash -bash scripts/branch-context.sh -``` - -**If the next task relates to the current branch's theme** (similar area, branch has few commits): -→ Suggest: "The next task ([task title]) fits this branch's theme. Open a new chat and say 'work on next task' to continue." - -**If the next task is a different area, or the branch already has several commits (3+)**: -→ Suggest: "This branch has [N] commits covering [theme]. Consider running `/pr` to close this chapter, then say 'work on next task' in a fresh session." - -**Never continue task work in the same session.** Each task = fresh context window. Always recommend opening a new chat. - -## Handover Notes - -If the session is ending before the task is fully complete, write a handover note in the backlog file. This is the next session's starting context — without it, the next session starts from scratch and may redo work, make contradictory decisions, or miss context that only existed in the conversation. 
- -The handover must capture: -- What has been done so far (which steps completed) -- What remains to be done (which steps are left) -- Key decisions made and why -- What was tried and didn't work -- The human's guidance and preferences from this session -- Concerns or risks identified - -The most important thing to preserve is the *reasoning* — code changes are in the working tree, but the reasoning exists only in the conversation and must be written down. diff --git a/.claude/skills/commit/SKILL.md b/.claude/skills/commit/SKILL.md deleted file mode 100644 index 22ca3d0..0000000 --- a/.claude/skills/commit/SKILL.md +++ /dev/null @@ -1,102 +0,0 @@ ---- -name: commit -description: Create a conventional commit — with atomic commit gate that checks quality before committing -allowed-tools: Bash -context: fork ---- - -Create a git commit with conventional commit format. Gathers context first, then validates atomicity and writes the commit. - -**Context**: $ARGUMENTS -(May contain: staging instructions, or commit context about what/why changes were made) - -## Step 1: Stage - -Stage all changes, excluding `.agent/` (local task tracking, never committed): - -```bash -git add --all -git reset HEAD .agent/ -``` - -- If context provides specific staging instructions: Follow those instead. - -Review what's staged: -```bash -git diff --staged -``` - -This provides the first piece of the context contract: what changed, which files, which modules. Always mechanical. - -## Step 2: Gather Context - -**Context contract** — to write a meaningful commit, the agent needs: (1) what changed (the staged diff from step 1) and (2) the full context — why it changed, whether it's one concern, whether it's self-contained. The second part serves double duty: the same context that tells the agent *why* also tells it whether the diff is atomic. Stop at the earliest tier that fulfills the contract. 
- -**Tier 1 — Mechanical**: `$ARGUMENTS` provides the reasoning, `git diff --staged` provides the technical details. Together they typically answer both "is this one concern?" and "why was this done?" Contract complete. Proceed to step 3. - -**Tier 2 — Agent reasoning**: No `$ARGUMENTS`. Derive context from what's available: - -1. **Staged diff** — what files changed, what code was added/removed/modified. -2. **Git log** — `git log --oneline -5` provides context for what's been happening on this branch. -3. **Session conversation** — if the agent made the changes itself, it has the full reasoning in context. - -Self-explanatory changes — clear bug fixes, renames, test additions — resolve here. If the agent can write an accurate commit body and confirm atomicity, the contract is complete. Proceed to step 3. - -**Tier 3 — Human input**: Can't confidently determine either atomicity or the "why." Present what the agent *can* see and ask for the gaps: - -- "I see [what the diff shows]. This looks like [inference]. Is this correct?" -- "What problem did this solve?" / "Were there alternatives you considered?" - -**The threshold**: Would the commit body contain fabricated reasoning if you don't ask? If yes, ask. If you can write something accurate (even if not perfectly detailed), proceed. - -## Step 3: Validate Atomicity (Atomic Commit Gate) - -With full context gathered, validate the staged diff is atomic. This is a reasoning step over the diff already loaded in step 1 — NOT a separate investigation phase. No extra file reads, no extra tool calls. If any check fails, STOP and report what needs to be fixed. - -Validate two constraints: - -**Single concern** — Do all changes relate to one logical purpose? Scan the file paths in the staged diff. Do they cluster around one area/module, or are unrelated modules mixed in? 
-- Red flags: changes in `adr_kit/core/` AND `adr_kit/mcp/` for unrelated reasons; a bug fix mixed with a feature; formatting-only changes alongside logic changes. -- A broad diff isn't a problem by itself (a rename across many files is fine) but is a signal to look closer at whether multiple concerns are mixed. -- How to check: look at the file paths already visible in `git diff --staged`. No extra reads needed. -→ If mixed: report which files don't belong, suggest splitting. - -**Self-contained** — Does the commit include everything needed to be complete on its own? - -- **Completeness**: Does this commit deliver standalone value, or is it half-finished work that only makes sense with a future commit? -- **Tests**: If behavior was added or changed, are tests included that cover that specific behavior — not just exist alongside it? Check if `tests/` files appear in the staged diff alongside `adr_kit/` implementation changes. Exception: pure refactors that don't change behavior, or changes to non-code files (docs, configs). - → If missing: report which implementation changes lack tests. -- **Documentation**: If the change introduces something significant — a new architectural pattern, a major design decision, a core abstraction — evaluate whether documentation exists. If not, escalate: surface what you identified as significant, propose where and how to document it, and let the human decide. Don't do a deep documentation audit — if the staged diff includes changes to public APIs (new parameters, changed return types, removed functions), check whether docstrings in those same files were updated. This is visible in the diff you already have. Don't read external doc files to cross-reference. - -If both constraints pass, proceed to step 4. 
- -## Step 4: Write the Commit - -Once the contract is complete and atomicity is confirmed, create the conventional commit: - -Format: `(): ` - -Body: Explain WHY (not what — that's visible in diff) - -- Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore` -- Scope: optional (e.g., `mcp`, `cli`, `core`) -- Description: imperative, lowercase, no period -- Body: Explain the reason/motivation for the change - -```bash -git commit -m "$(cat <<'EOF' -(): - -Why this change was made: -EOF -)" -``` - -The pre-commit hook will automatically check format + lint on staged files. -If it fails: inspect the hook output to see what failed and where, fix the issues, re-stage, and retry. - -## Step 5: Confirm - -```bash -git status -``` diff --git a/.claude/skills/pr/SKILL.md b/.claude/skills/pr/SKILL.md deleted file mode 100644 index f368d09..0000000 --- a/.claude/skills/pr/SKILL.md +++ /dev/null @@ -1,147 +0,0 @@ ---- -name: pr -description: Create a human-readable PR from the current feature branch -allowed-tools: Bash, Read, Glob -context: fork ---- - -Create a pull request from the current feature branch. The PR is the human review gate — it must be easy for a human to understand with minimal cognitive load. - -**Context**: $ARGUMENTS -(May contain: summary of what the branch accomplished, passed from /close or provided directly) - -## 1. Verify - -```bash -git branch --show-current -``` - -Must be on a feature branch (not `main`). If on main: STOP. Run `scripts/branch-context.sh` to gather the branch landscape. If feature branches exist, present them and ask which one to PR: - -"You're on main — no feature branch to PR from. But I see these feature branches: -- ``: [N] commits about [topic] -- ``: [N] commits about [topic] - -Which branch would you like to create a PR from?" - -If no feature branches exist: "You're on main and there are no feature branches. Nothing to PR." - -## 2. 
Sync with Main - -```bash -git checkout main && git pull origin main && git checkout - && git rebase main -``` - -If rebase conflicts occur: STOP and help resolve. Do not proceed with unresolved conflicts. - -## 3. Run Quality Suite - -```bash -make quality -``` - -This runs format + lint + tests (~37s). It's the final gate before the PR goes out. - -If anything fails: fix it, commit the fix, then re-run `make quality` until it passes. Do not proceed with failing quality checks. - -## 4. Gather Context - -Collect the raw material: -```bash -git log main..HEAD --format="%s%n%n%b" --reverse -``` - -**Context contract** — the agent needs four pieces: (1) why this work was done, (2) what approach was taken and why, (3) what was tested, and (4) what risks exist. Stop at the earliest tier that fulfills the contract. - -**Tier 1 — Mechanical**: `$ARGUMENTS` + `git log main..HEAD` commit bodies have the reasoning. When `$ARGUMENTS` are rich or commit bodies explain *why* — the typical case when `/commit` was used with good context upstream — the contract is fulfilled without needing the full diff. Proceed to write the PR. - -**Tier 2 — Agent reasoning**: Tier 1 produces gaps — commit bodies have the *what* but not the *why*, or trade-offs aren't documented. Gather the actual changes: - -```bash -git diff main..HEAD -``` - -Reason over the diff alongside the log. If the agent can fill the gaps confidently without fabricating, it does. Proceed to write the PR. - -**Tier 3 — Human input**: The agent would have to fabricate reasoning. Present what it *can* see: -- "This branch has N commits touching [modules]. The changes appear to [summary]." -- Ask for the gaps: "Can you explain the motivation and any key trade-offs, so the PR description is accurate?" - -Use the human's response to fill in Why and Approach. Derive What Was Tested from test files in the diff. Assess Risks from the scope of changes. - -## 5. 
Push + Create or Update PR - -Once the context contract is complete, push changes: - -```bash -git push -u origin HEAD -``` - -Check if a PR already exists for this branch: - -```bash -gh pr list --head $(git branch --show-current) --json number,title -``` - -**If PR exists**: Gather updated context from all commits since the PR was created, then update the PR description: - -```bash -gh pr edit --body "$(cat <<'EOF' -## Why - - -## Approach - - -## What Was Tested - - -## Risks - -EOF -)" -``` - -Report: "PR #N updated with new commits and description. Review at [URL]" - -**If no PR exists**: Create a new one: - -```bash -gh pr create --title "" --body "$(cat <<'EOF' -## Why -<2-3 sentences: what problem this solves and why it matters now> - -## Approach - - -## What Was Tested - - -## Risks - -EOF -)" -``` - -**PR content rules** — only include what GitHub doesn't already show: -- Don't list files changed (GitHub shows that) -- Don't say "N tests passed" (CI shows that) -- Focus on reasoning, approach, test coverage, and risks - -## 6. Switch to Main - -```bash -git checkout main && git pull origin main -``` - -## 7. Report - -Output the PR URL and: -"PR created. Review and merge in GitHub, then open a new chat and say 'work on next task' to continue." diff --git a/.gitignore b/.gitignore index 192aaeb..9dd9df9 100644 --- a/.gitignore +++ b/.gitignore @@ -79,7 +79,7 @@ docs/** # Test ADRs and artifacts (project docs are in guide/ ins .mcp.json # IDE integration files -.claude/settings.local.json +.claude/ .cursor/ # Generated lint config examples (keep actual configs) diff --git a/CLAUDE.md b/CLAUDE.md index 0bd8b18..7483ead 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -255,9 +255,24 @@ exclude = ["tests*", "scripts*", ".agent*"] - Type safety with Pydantic/FastMCP - Backward compatibility for public APIs +## Python Executable + +Use `python3` (or the value from `.agent/config.json` → `python` key) for all script invocations. 
Never use bare `python` — it may resolve to Python 2 on some systems. + +## Task Workflow — Session Protocol + +**When starting work on a task** (e.g., "work on next task"), follow the session protocol in [`.agent/CLAUDE.md`](.agent/CLAUDE.md). That file is the authoritative reference for the task workflow system — read it first. + +**Quick summary** (details in `.agent/CLAUDE.md`): +1. Run `python3 .agent/scripts/next_task.py` to get the next task +2. Confirm with user, invoke `/branch` +3. Research, plan, decompose into atomic steps +4. Implement step by step, `/close` after each step +5. Never work directly on main — always use a feature branch + +**Git workflow design**: [`.agent/workflows/git-workflow.md`](.agent/workflows/git-workflow.md) (human-readable, loaded only on request) + ## Additional Documentation -- **Development workflows**: See `guide/dev/` for task workflow and git workflow guides - - **CRITICAL**: When starting work on a task (e.g., "work on next task"), follow the full **conversational workflow** in [`guide/dev/task-workflow.md`](guide/dev/task-workflow.md#conversational-workflow--clean-start) - all 5 steps are mandatory, especially step 3 which invokes `/branch` to create a feature branch. Never work directly on main. - Performance targets, security requirements: `docs/requirements.md` - ADR format specification: Examples in `tests/fixtures/` \ No newline at end of file diff --git a/guide/dev/git-workflow.md b/guide/dev/git-workflow.md deleted file mode 100644 index e3c9a7f..0000000 --- a/guide/dev/git-workflow.md +++ /dev/null @@ -1,485 +0,0 @@ -# Git Workflow — Design & Architecture - -This document captures the *why* and *how* of the agent-driven git workflow used in this project. It's not a how-to guide — the skill files encode the operational details. This is for understanding the design, assumptions, and reasoning. 
- -The workflow is designed to guide agent behavior so that git operations are handled with consistent quality — automatically. The human's role narrows to the parts only a human can provide: the reasoning behind changes, judgment on trade-offs, and approval before code reaches main. Everything mechanical — branch decisions, commit structure, quality checks, PR formatting — is handled by the agent following encoded rules. - -> **Implementation note**: This workflow is built around [Claude Code](https://docs.anthropic.com/en/docs/claude-code) and its skills system (`.claude/skills/`). Throughout this document, Claude Code is referred to as *the agent*. The *principles* — trunk-based development, atomic commits, structured PRs, quality layering — are universal and adaptable to any AI-assisted development tool. The specific implementation (skills, `$ARGUMENTS` passing, `Skill()` invocations) uses Claude Code's features. - -> **Makefile commands**: This project uses a [Makefile](../../Makefile) to bundle quality checks and development commands. Throughout this document, references like `make quality`, `make test-unit`, etc. are Makefile targets — run `make help` for the full list. - -> **Status**: This workflow is being established as the standard going forward. Earlier commit history may not reflect it — that's the gap this workflow fills. - -## The Problem - -Agents can follow git best practices — but only if you encode them. Without an explicit workflow, each session starts from scratch and the agent improvises git operations with varying quality. The human ends up either micro-managing every commit message and branch decision, or accepting inconsistent output. - -This workflow encodes the discipline once so it's applied automatically every time. 
The investment is paid upfront — defining the rules — and the returns compound with every operation: each well-structured commit feeds into a better PR description, each clean branch keeps the history navigable, each quality gate catches problems before they become expensive to fix. - -### Why This Matters - -A well-maintained git history enables debugging with `git bisect`, understanding decisions through `git blame`, safe reverts, and efficient code review. When git discipline is skipped, the cost compounds — each vague commit makes the next debugging session slower, each missing PR description makes the next review harder. This workflow inverts that dynamic: the benefits compound instead of the costs. - -## Three Levels of Granularity - -This is the core mental model. Every git operation in this workflow maps to one of three levels, each serving a distinct purpose with clear boundaries: - -### Commit = Atomic Unit of Work - -A commit represents one logical change that delivers standalone value: - -- **Atomic**: Contains everything needed — code, tests, docs. Never half-finished. -- **Traceable**: The message explains *what* and *why*, not just *how*. -- **Revertable**: Can be cleanly undone without side effects. -- **Bisectable**: `git bisect` can pinpoint this commit because it doesn't bundle unrelated changes. - -Small, focused commits reduce integration risk. Each one is a checkpoint — if something breaks, you know exactly where and can reason about what changed. - -**Example** — a well-structured conventional commit message: - -``` -feat(knowledge): add LRU caching to KnowledgeLoader - -Loading evaluation criteria from disk on every preflight call added -~200ms latency. LRU cache eliminates repeated reads within a session. - -Chose LRU over TTL because data files are static within a session. -Cache invalidates on file change detection. Not invalidated for -mid-session edits — acceptable since data files only change between -releases. 
- -Includes unit tests for empty cache, full cache, and invalidation. -``` - -The subject line tells you *what*. The body tells you *why this approach*, *what alternatives were considered*, and *what was tested*. This body becomes raw material for the PR description downstream — good commit messages compound into good PRs. - -### Branch = Thematic Group ("Chapter") - -A branch groups related commits into a coherent chapter of work. Examples: -- `feat/ai-steering-enhancements` — multiple commits improving different aspects of AI steering -- `fix/schema-validation-edge-cases` — several edge case fixes in the same system -- `docs/git-workflow-design` — documentation work on a single topic - -**Why group, not one commit per PR?** Grouping related work reduces PR overhead and provides contextual coherence — the reviewer sees a complete story, not scattered changes. - -**How many commits?** Typically a handful — enough to tell a coherent story, few enough to keep the branch short-lived. This isn't a fixed boundary; a branch might have one commit or several, depending on how much work belongs to the theme. The principle is: short-lived branches minimize merge conflicts and integration risk. If a branch grows large, that's a signal to split and PR sooner. - -### PR = Human Review Gate - -The PR is where human judgment enters. It's the boundary between "work was done" and "work is approved for main": - -1. **Agent-generated code needs human oversight** — the agent may make technically correct but architecturally wrong choices. -2. **The description surfaces reasoning** — not just what changed (the diff shows that) but *why* and *what trade-offs were made*. -3. **It creates a permanent record** — the PR conversation becomes part of the project's decision history. - -**The key constraint**: If a new task doesn't fit the current branch's theme, the current branch should be PR'd and merged first. 
This keeps main as the source of truth and prevents branches from forking off unreviewed work. - -**Example** — a PR description that serves the reviewer: - -```markdown -## Why - -KnowledgeLoader reads evaluation criteria from disk on every preflight -call, adding ~200ms per invocation. For projects with frequent preflight -checks, this adds up noticeably. - -## Approach - -Added LRU caching keyed by file path. Chose LRU over TTL because data -files are static within a session — they only change between releases. -Cache invalidates when file modification time changes. - -Considered a global module-level cache but rejected it because it would -persist across test runs and cause flaky tests. Instance-level cache on -KnowledgeLoader keeps the lifecycle clean. - -## What Was Tested - -- Unit tests: empty cache hit, populated cache hit, cache miss after - file change, cache eviction at capacity -- Integration test: full preflight cycle with caching enabled -- Manual: verified ~200ms → ~2ms on second call in dev environment - -## Risks - -- Cache is not invalidated if data files are edited mid-session (by - design — documented in docstring) -- LRU size is hardcoded at 128 entries — sufficient for current use but - may need configuration if projects grow significantly -``` - -This format focuses on what GitHub doesn't already show. File lists, test pass counts, and implementation details visible in the diff are deliberately excluded — the reviewer can see those directly. What they can't see is *why this approach*, *what was rejected*, and *where to focus attention*. - -## Why Trunk-Based Development - -Based on trunk-based development principles ([trunkbaseddevelopment.com](https://trunkbaseddevelopment.com), Martin Fowler's branching patterns): - -- **Main is always the deployable source of truth.** All changes arrive via PR. This is especially important when agents are generating code — the PR gate ensures a human reviews before anything reaches users of the library. 
-- **Feature branches are short-lived.** They group thematically related work and are deleted after merge. Short branches minimize divergence and merge conflicts. -- **Each commit is atomic and standalone.** Even on a multi-commit branch, any single commit delivers value. - -**Why not GitFlow?** GitFlow adds ceremony (develop, release, hotfix branches) suited for projects with formal release trains. For a library with continuous delivery, trunk-based is simpler and faster. - -**A note on overhead**: Requiring PRs for every change adds friction, especially for a solo developer. This workflow makes that trade-off deliberately — the library is used by others, and the PR gate ensures nothing reaches main without human review, particularly when the agent is doing the implementation. The goal is staying in control, not adding ceremony. The workflow is designed so that the mechanical parts — branch decisions, commit structure, quality checks, PR formatting — are handled by the agent. The human's only manual step is reviewing and merging the PR in GitHub, which is the whole point. The overhead lives in the automation, not on the developer. - -## Quality Layering - -Quality is enforced through six layers, each catching a different class of issues at increasing scope. This follows the **shift-left principle**: the earlier in the pipeline a problem is caught, the cheaper it is to fix. A missing "why" caught at the commit context check costs a quick question. A mixed-concern commit caught at the atomic gate costs seconds to split. The same problem discovered during PR review costs a conversation. Discovered after merge, it costs a revert and a new PR cycle. 
Each layer exists to catch problems at the cheapest possible point:
-
-| Layer | What It Catches | When It Runs | How |
-|-------|----------------|--------------|-----|
-| **Commit context check** | Missing reasoning — vague or fabricated commit messages | Before each commit | Agent validates context contract: the "why" is available (via tiers 1–3) before writing |
-| **Atomic commit gate** | Structural problems — mixed concerns, missing tests, incomplete work | Before each commit | Agent validates the staged diff is one concern, self-contained, with tests and docs |
-| **Pre-commit hook** | Style problems — formatting, linting violations | On each `git commit` | `scripts/pre-commit` runs automatically via git hooks on staged files |
-| **`make quality`** | Integration problems — test failures, cross-module issues | Before each PR | Full format + lint + test suite |
-| **PR context check** | Missing reasoning — thin context that would produce fabricated PR descriptions | Before PR description is written | Agent validates context contract: Why / Approach / Tested / Risks are filled (via tiers 1–3) |
-| **CI** | Environment problems — dependency issues, platform-specific failures | On PR creation/update | GitHub Actions in a clean environment |
-
-Each layer has a distinct job. The commit context check ensures the agent has real reasoning before writing a commit message — no context, no commit. The atomic gate ensures the staged diff is structurally sound. The pre-commit hook enforces code style automatically — git runs `scripts/pre-commit` on every commit, and if it fails, the commit is blocked until the issues are fixed. When the hook fails, its output should be **diagnostic, not prescriptive** — it reports *what* failed and *where* so the agent can reason about the fix itself, rather than blindly following a hardcoded recovery command. `make quality` validates the branch as a whole before the agent pushes.
The PR context check ensures the description has real reasoning, not fabricated filler. CI verifies in a clean environment that nothing was missed locally. - -### Test Verification Points - -Tests run at three points in the workflow, each with different scope and purpose: - -1. **During `/close` (optional)**: Agent can run tests before commit if implementation changed - - **When**: Before invoking `/commit` in the `/close` skill - - **Scope**: Relevant tests for the changed module (fast subset) - - **Command**: `pytest tests/unit/test_foo.py -v` or `pytest tests/unit/ -x --tb=line` - - **Purpose**: Catch test failures early, before commit - - **Trade-off**: Faster feedback loop vs. optional (agent decides based on change type) - -2. **Pre-commit hooks (optional)**: Projects can enable test-running in hooks - - **When**: On each `git commit` (if configured in `.git/hooks/pre-commit`) - - **Scope**: Changed modules only (to keep it fast) - - **Purpose**: Block commits with failing tests automatically - - **Trade-off**: Guaranteed safety vs. slower commits - - **Note**: Not enabled by default in this project due to performance impact - -3. **During `/pr` (mandatory)**: `make quality` includes full test suite - - **When**: Before pushing and creating PR - - **Scope**: Full test suite (`pytest tests/`) - - **Command**: `make quality` (format + lint + tests) - - **Purpose**: Final gate before code leaves local machine - - **Trade-off**: Comprehensive but slower (~37s) - -**Recommended approach**: Run tests at `/close` time for implementation changes. This catches failures earlier than `/pr` without slowing down docs-only commits. The `/pr` quality gate ensures nothing is missed. 
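
The recommended approach above (run tests at `/close` for implementation changes, skip them for docs-only or test-only changes) can be sketched as a small decision helper. This is a hypothetical illustration, not code from the `/close` skill: the function names and the `DOC_SUFFIXES`, `DOC_DIRS`, and `TEST_DIRS` conventions are assumptions chosen for the example.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Assumed conventions for illustration only; adjust to the repository layout.
DOC_SUFFIXES = {".md", ".rst", ".txt"}
DOC_DIRS = {"docs", "guide"}
TEST_DIRS = {"tests"}


def is_docs_or_tests(path: str) -> bool:
    """Return True if the path looks like documentation or test code."""
    p = PurePosixPath(path)
    top = p.parts[0] if p.parts else ""
    return p.suffix in DOC_SUFFIXES or top in DOC_DIRS or top in TEST_DIRS


def should_run_tests_at_close(changed_paths: Iterable[str]) -> bool:
    """Run the fast test subset only when at least one implementation file changed."""
    return any(not is_docs_or_tests(path) for path in changed_paths)
```

Whatever this helper decides, the full suite still runs through `make quality` at `/pr` time, so a wrong "skip" here only delays the failure rather than hiding it.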
-
-## How It All Fits Together
-
-```mermaid
-sequenceDiagram
-    participant H as Human
-    participant B as /branch
-    participant I as Implement
-    participant C as /commit
-    participant P as /pr
-    participant GH as GitHub
-
-    H->>B: Start work (with context)
-    B->>B: Evaluate branch state
-    B-->>I: Create or continue branch
-
-    loop For each atomic unit of work
-        I->>I: Write code + tests
-        I->>C: /commit (with "why" context)
-        C->>C: Atomic gate (reason over diff)
-        C->>C: Pre-commit hook (format + lint)
-        C-->>I: Committed
-    end
-
-    H->>P: Ready for review
-    P->>P: Rebase on main
-    P->>P: make quality (full suite)
-    P->>P: Gather PR context (contract)
-    P->>P: Push branch
-    P->>GH: Create PR (Why / Approach / Tested / Risks)
-    GH->>GH: CI runs
-    H->>GH: Review & merge
-```
-
-The human initiates work and reviews at the end. Everything in between — branch decisions, commit structure, quality gates, PR formatting — is handled by the agent following the encoded workflow. The human provides judgment; the agent provides the mechanical discipline.
-
-## The Git Skills
-
-Three skills handle all git operations. Each is independently usable — you can invoke `/commit` without `/branch`, or `/pr` at any point. They compose but don't require each other.
-
-| Skill | Responsibility |
-|-------|---------------|
-| `/branch` | Evaluate branch state, create/continue/escalate |
-| `/commit` | Stage changes, enforce atomic quality, write conventional commit |
-| `/pr` | Sync with main, run quality suite, push, create PR |
-
-Each skill follows the same two-phase structure: **gather context**, then **act**. The context gathering follows three tiers — mechanical, agent reasoning, human input — stopping as soon as the skill has what it needs. Each skill defines a **context contract**: the minimum information required to take its action confidently. The three tiers are how that contract gets fulfilled.
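
The "gather context, then act" structure with its three tiers can be sketched as a loop that stops at the earliest tier fulfilling the contract. This is a minimal sketch under stated assumptions: `Tier`, `Gatherer`, and `resolve_context` are hypothetical names, and modeling the context contract as a set of required keys is a simplification for illustration, not how the skill files encode it.

```python
from enum import Enum
from typing import Callable, Dict, List, Set, Tuple


class Tier(Enum):
    MECHANICAL = 1       # e.g. $ARGUMENTS, git log, staged diff
    AGENT_REASONING = 2  # inference over diff, history, conversation
    HUMAN_INPUT = 3      # ask the human only for the remaining gaps


# A gatherer returns whichever contract fields it could fill (hypothetical shape).
Gatherer = Callable[[], Dict[str, str]]


def resolve_context(
    required: Set[str],
    gatherers: List[Tuple[Tier, Gatherer]],
) -> Tuple[Dict[str, str], Tier]:
    """Run gatherers in tier order; stop at the first tier that fulfills the contract."""
    context: Dict[str, str] = {}
    for tier, gather in gatherers:
        # Keep only non-empty answers; a field filled by an earlier tier is not overwritten.
        for key, value in gather().items():
            if value and key not in context:
                context[key] = value
        if required.issubset(context):
            return context, tier
    missing = sorted(required - set(context))
    raise ValueError(f"context contract unfulfilled, missing: {missing}")
```

For `/pr`, `required` would be something like `{"why", "approach", "tested", "risks"}`. Raising when every tier is exhausted mirrors the rule that the agent should not act on fabricated reasoning.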
- -### `/branch` — Autonomous Branching - -The `/branch` skill manages branching decisions through three steps: gather the branch landscape, gather the task theme, then decide and act. - -#### Step 1: Gather Branch Landscape + Cleanup - -`scripts/branch-context.sh` fetches from remote (to ensure `origin/main` is current), gathers the state of all local branches, and cleans up stale branches — all in one call without switching branches. A branch is stale when all its commits are already in main (`git branch --merged main`). Stale branches are deleted; branches with commits not in main are work in progress and kept. The agent reports what it removed. - -After cleanup, the script outputs the remaining landscape: current branch state + any surviving WIP branches. This fulfills the first part of the context contract. A single script call replaces what would otherwise be multiple fragile tool calls. - -#### Step 2: Gather Task Theme - -**Context contract** — to make a branching decision, the agent needs: (1) the branch landscape (from step 1) and (2) the task theme. The task theme is where the tiers apply: - -**Tier 1 — Mechanical**: `$ARGUMENTS` provides the task theme directly. Contract complete. - -**Tier 2 — Agent reasoning**: No `$ARGUMENTS`. The agent tries to derive the task theme from what it already has, in this order: - -1. **Branch landscape** — if WIP branches exist, their names, commit subjects, and changed files suggest the thematic area. If there's only one WIP branch, it's likely where work should continue. -2. **Conversation context** — what the human said earlier in this session may indicate what they want to work on. -3. **Task tracking** — `.agent/task-tracking.md` lists the current priority queue, which may indicate the next task's theme. (Only available when using the [Task Workflow](./task-workflow.md).) - -If the agent can confidently determine the theme from these sources, the contract is complete. 
-
-**Tier 3 — Human input**: The agent has partial information but isn't confident. It presents what it knows — "I see [WIP branch X] with commits about [topic], and the next task in the queue is [Y]" — and asks specifically for what's missing.
-
-#### Step 3: Decision and Action
-
-Only once the contract is complete — branch landscape + task theme — does the agent decide:
-
-- **Recommend `/pr` first**: When a WIP branch exists that doesn't fit the task. New branches should always start from the latest main — working on a second branch in parallel risks conflicts and divergence.
-- **Rename**: When the current branch name is too narrow for the task theme (e.g., named after one specific task but the theme is broader). Only for unpushed branches — renaming a pushed branch is disruptive.
-
-The full decision matrix is in the [escalation table](#autonomous-vs-escalation--design-intent) below.
-
-### `/commit` — Conventional Commits with Reasoning
-
-The `/commit` skill creates conventional commit messages (`<type>(<scope>): <description>`) with a body explaining *why*. It gathers context first, then uses that context for both atomicity validation and message writing.
-
-**Context contract** — to write a meaningful commit, the agent needs:
-
-1. **What changed** — the staged diff
-2. **The full context** — why it changed, whether it's one concern, whether it's self-contained
-
-The second part serves double duty: the same context that tells the agent *why* also tells it whether the diff is atomic. Once gathered, it feeds both the validation and the message.
-
-#### Step 1: Stage
-
-The agent stages all changes, *excluding* `.agent/` which must never be committed (it contains local task tracking and session state). `git diff --staged` provides the first piece of the contract: what changed, which files, which modules. Always mechanical.
-
-#### Step 2: Gather Context
-
-The agent gathers enough context to both validate atomicity and write the commit message.
The tiers apply: - -**Tier 1 — Mechanical**: `$ARGUMENTS` provides the reasoning, `git diff --staged` provides the technical details. Together they typically answer both "is this one concern?" and "why was this done?" Contract complete. - -**Tier 2 — Agent reasoning**: No `$ARGUMENTS`. The agent derives context from what it has: - -1. **Staged diff** — what files changed, what code was added/removed/modified. -2. **Git log** — recent commits provide context for what's been happening on this branch. -3. **Session conversation** — if the agent made the changes itself, it has the full reasoning in context. - -Self-explanatory changes — clear bug fixes, renames, test additions — resolve here. The agent can typically determine both atomicity and the "why" from these sources. - -**Tier 3 — Human input**: The agent can't confidently determine either atomicity or the "why." It presents what it *can* see and asks for the gaps — whether that's "are these one concern or should they be split?" or "what problem did this solve?" False information in a commit message is worse than a brief question. - -#### Step 3: Validate Atomicity - -With full context gathered, the agent validates the staged diff is atomic: - -**Single concern** — Do all changes relate to one logical purpose? File paths clustering around one area is a good signal; unrelated modules appearing together is a red flag. A broad diff isn't a problem by itself (a rename across many files is fine) but is a signal to look closer at whether multiple concerns are mixed. - -**Self-contained** — Does the commit include everything needed to be complete on its own? - -- **Completeness**: Does this commit deliver standalone value, or is it half-finished work that only makes sense with a future commit? -- **Tests**: If behavior was added or changed, are tests included that cover that specific behavior — not just exist alongside it? 
-- **Documentation**: If the change introduces something significant — a new architectural pattern, a major design decision, a core abstraction — the agent evaluates whether documentation exists. If not, it escalates: surfaces what it identified as significant, proposes where and how to document it, and lets the human decide.
-
-If the gate fails, the agent stops and reports what needs to be fixed. The gate is advisory — if the human explicitly overrides, it proceeds.
-
-#### Step 4: Write the Commit
-
-Once the contract is complete and atomicity is confirmed, the agent writes the conventional commit message: `<type>(<scope>): <description>` with a body explaining *why*. The pre-commit hook runs automatically — if it fails, the agent inspects the output, fixes the issues, re-stages, and retries.
-
-#### Step 5: Confirm
-
-After the commit succeeds, the agent runs `git status` to verify the commit was applied cleanly — no unexpected unstaged leftovers, no partial commits. This is a quick sanity check that catches silent failures.
-
-### `/pr` — Human-Readable Pull Requests
-
-The `/pr` skill creates the PR that serves as the [human review gate](#pr--human-review-gate). It follows six steps: verify the starting point, sync and validate, gather context, write the description, create the PR, and hand off to the human.
-
-#### Step 1: Verify
-
-The agent confirms it's on a feature branch (not `main`). If on main, it stops — but doesn't just say "nothing to PR." It runs `scripts/branch-context.sh` to gather the branch landscape. If feature branches exist, it presents them with their commit counts and topics, and asks which one the user wants to PR. If no feature branches exist, it reports that clearly. This turns a dead end into an actionable prompt.
-
-#### Step 2: Sync + Validate
-
-The agent rebases the feature branch on main (stopping for the human if conflicts occur) and runs `make quality` — the last local gate before code leaves the developer's machine.
This ensures the branch is rebased and all checks pass before proceeding. - -#### Step 3: Gather Context - -**Context contract** — the agent needs four pieces: (1) why this work was done, (2) what approach was taken and why, (3) what was tested, and (4) what risks exist. - -**Tier 1 — Mechanical**: The agent collects `$ARGUMENTS` and `git log main..HEAD` (commit subjects and bodies). When `$ARGUMENTS` are rich or commit bodies explain *why* — the typical case when `/commit` was used with good context upstream — the contract is fulfilled without needing to load the full diff. - -**Tier 2 — Agent reasoning**: Tier 1 produces gaps — commit bodies have the *what* but not the *why*, or trade-offs aren't documented. The agent gathers `git diff main..HEAD` and reasons over the actual changes alongside the log. If it can fill the gaps confidently without fabricating, it does. - -**Tier 3 — Human input**: The agent would have to fabricate reasoning. It presents what it *can* see ("This branch has N commits touching [modules]. The changes appear to [summary].") and asks for what's missing. - -#### Step 4: Write the PR Description - -Once the contract is complete, the agent uses the gathered context to write the structured description ([format above](#pr--human-review-gate)). The description focuses on what GitHub doesn't already show — not file lists or test counts, but reasoning, trade-offs, and where to focus attention. - -#### Step 5: Create the PR + Hand Off - -The agent pushes the branch, creates the PR with the description, and outputs the PR URL so the human can click through directly to review and merge. - -#### Step 6: Switch to Main - -The agent switches back to main and pulls to ensure it's current. This keeps main up to date for whatever comes next — whether that's `/branch` for new work or manual commands. - -### How Context Contracts Compound - -Each skill's context contract is partially fulfilled by the output of the previous skill. 
When `/branch` creates a well-named thematic branch, `/commit` has clearer scope signals — its contract is easier to fill from Tier 1. When `/commit` produces rich commit bodies, `/pr`'s contract (Why / Approach / Tested / Risks) is largely fulfilled by `git log` alone — Tier 1 again.
-
-Conversely, thin upstream output cascades: vague commit messages leave `/pr`'s contract unfulfilled at Tier 1, forcing it into reasoning (Tier 2) or human input (Tier 3). The investment at each skill pays forward to the next one — this is the compounding effect in practice.
-
-## Autonomous vs. Escalation — Design Intent
-
-The workflow is designed around a principle: **the agent decides by itself when the answer is clear, and asks the human when genuinely uncertain.** Escalation is not failure — it's the agent being honest about its limits rather than fabricating.
-
-| Situation | Intended Action |
-|-----------|----------------|
-| Stale branches (merged into main) | Clean up silently, report what was removed |
-| On main, no WIP branches, has context | Create branch silently |
-| On main, WIP branch fits task theme | Switch to it silently |
-| On main, WIP branch doesn't fit task | Escalate: recommend finishing it first (`/pr`) |
-| On main, multiple WIP branches | Escalate: present landscape, let human decide |
-| On feature, task fits current branch | Continue silently |
-| On feature, branch name too narrow (unpushed) | Rename silently |
-| On feature, task clearly doesn't fit | Escalate: recommend `/pr` first |
-| On feature, ambiguous fit | Escalate: present context, let human decide |
-| Context provided for commit | Write conventional commit silently |
-| No context, changes self-explanatory | Derive from diff, proceed |
-| No context, changes ambiguous | Escalate: present what it sees, ask for the "why" |
-| Changes span multiple concerns | Escalate: suggest splitting |
-| Rebase conflicts | Escalate: help resolve |
-| Quality check fails | Fix silently, retry |
-| Commit bodies have reasoning | Assemble PR description silently |
-| Commit bodies are thin | Derive what it can, ask for approach/trade-offs |
-
-**How escalation works in practice**: The agent never asks a blank "what should the commit message be?" It presents what it *can* see (the diff, likely scope, its inference), and asks only for the gap. This makes escalation cheap for the human — confirm or correct, rather than explain from scratch.
-
-**A note on reliability**: This table describes the *intended* behavior — the design target. The quality of these decisions depends on how well the agent interprets the skill instructions in practice. The workflow encodes the target; consistent results require testing and refinement over time.
-
-## How the Pieces Compose
-
-### Standalone Usage
-
-Each git skill works independently. The text after the command becomes `$ARGUMENTS` — the context the skill uses:
-
-```
-# Just commit current changes:
-/commit "Refactored the parser to handle nested YAML blocks"
-
-# Just create/evaluate a branch:
-/branch "Working on schema validation improvements"
-
-# Just create a PR from the current branch:
-/pr "This branch adds the knowledge module with caching and graceful degradation"
-```
-
-### Multi-Session Flow
-
-A typical multi-session project flow:
-
-```
-Session 1:
-  /branch "schema validation" → implement → /commit "why + what"
-
-Session 2:
-  /branch "still schema work" → continues → /commit "why + what"
-
-Session 3:
-  /pr → rebase, quality check, push, create PR
-  (human merges in GitHub)
-  /branch "new topic" → implement → /commit
-```
-
-## Troubleshooting: Tests Failing at `/pr` Time
-
-If tests fail when you run `/pr`, but you don't remember breaking them, follow this recovery process:
-
-### 1.
Identify Which Commit Broke Tests - -```bash -# View recent commits on this branch -git log main..HEAD --oneline - -# Check out each commit and run tests -git checkout <commit-hash> -pytest tests/unit/ # or the specific test that's failing -``` - -Repeat for each commit until you find the one that introduced the failure. - -### 2. Fix the Tests - -Return to your branch head: -```bash -git checkout <branch-name> -``` - -Fix the failing tests based on what you found. - -### 3. Decide How to Commit the Fix - -**Option A: Add fix as new commit (recommended)** -- Safer, preserves full history -- Shows that the issue was caught and fixed -- Command: Standard commit via `/close` or `/commit` - -**Option B: Amend the breaking commit (advanced)** -- Cleaner history, but requires rewriting commits -- Only use if the branch hasn't been pushed yet -- Commands: - ```bash - git rebase -i main - # Mark the breaking commit for 'edit' - # Make your fix - git add . - git commit --amend - git rebase --continue - ``` - -**Recommendation**: Use Option A (new commit). It's safer and preserves the full development history. The PR reviewer can see that tests failed and were fixed. - -### 4. Re-run `/pr` - -Once tests are fixed: -```bash -# Verify tests pass locally -pytest tests/unit/ -x --tb=line - -# Or run full quality suite -make quality -``` - -Then invoke `/pr` again. 
It will: -- Rebase on main (if needed) -- Run `make quality` (should pass now) -- Create/update the PR - -### Prevention: Run Tests During `/close` - -To catch test failures earlier (before `/pr` time): - -**For implementation changes**: -- Run relevant tests before calling `/close` -- The `/close` skill documentation includes guidance on when to run tests -- Example: `pytest tests/unit/test_foo.py -v` before committing - -**For docs-only or test-only changes**: -- Skip test-running at `/close` time -- Tests will still run at `/pr` time (mandatory) - -See [`.claude/skills/close/SKILL.md`](../../.claude/skills/close/SKILL.md) Step 3 for the full test verification guidance. - -### Integration with Other Systems - -The git skills are designed to be standalone. When used with a task workflow (like the [Task Workflow](./task-workflow.md)), richer context flows in — session summaries, task specifications, structured reasoning — which produces better commit bodies and PR descriptions. But the git skills don't *require* this. They work with whatever context is available, whether it comes from a task system, from the human directly, or from the agent's own analysis of the diff. diff --git a/guide/dev/task-workflow.md b/guide/dev/task-workflow.md deleted file mode 100644 index cea8ddb..0000000 --- a/guide/dev/task-workflow.md +++ /dev/null @@ -1,194 +0,0 @@ -# Task Workflow — Design & Architecture - -This document captures the *why* and *how* of the agent-driven task workflow used in this project. It covers how work is loaded, how sessions are bounded, how persistent state bridges the gap between stateless sessions, and how the workflow integrates with other workflows. - -> **Implementation note**: This workflow is built around [Claude Code](https://docs.anthropic.com/en/docs/claude-code) and its skills system (`.claude/skills/`). Throughout this document, Claude Code is referred to as *the agent*. 
The *principles* — structured task loading, bounded sessions, context passing — are adaptable to any AI-assisted development tool. The specific implementation uses Claude Code's skills, `$ARGUMENTS` passing, and `Skill()` invocations. - -> **Status**: This workflow is being established as the standard going forward. - -## The Problem - -Agent sessions are stateless. Each new conversation starts with a blank context — the agent doesn't know what you were working on, what's done, what's next, or what decisions were made in previous sessions. Without structure, every session begins with the human re-explaining context, and the agent has no sense of priority or progress. - -This creates three problems: - -1. **No direction**: The agent doesn't know what to work on, in what order, or how to break a large goal into actionable steps. Without orchestration, the human becomes the project manager on every single session. -2. **No continuity**: Decisions made in session 1 (design choices, trade-offs, things tried and rejected) are lost by session 2. The agent may revisit rejected approaches or make contradictory choices. -3. **No accountability**: Without tracking what's done and what's left, work falls through the cracks. Completed work isn't documented properly because the agent doesn't know it's about to finish. - -## The Solution - -The task workflow solves this by owning the two boundaries of every session: a **clean start** and a **clean finish**, connected by persistent state that lives outside the agent's context window. - -**Clean start** (conversational workflow): Load the right task from a prioritized queue, provide the relevant context, and confirm direction with the human — so the agent begins every session knowing exactly what to work on and why. - -**Clean finish** (`/close`): Capture the session's accumulated knowledge, update the persistent state, and hand off to any integrated workflows — so nothing is lost and the next session can pick up seamlessly. 
- -The [persistent state](#persistent-state--the-core-mechanism) (files in `.agent/`) is what makes both possible. It's the memory that bridges sessions. - -What happens *before* the session (creating and prioritizing backlog tasks) and what happens *during* (the actual implementation) are not the task workflow's core concern. Currently, backlog files are written manually and the conversational workflow includes basic task decomposition. Both are areas where additional tooling may be built. But the core value — ensuring every session starts clean and ends clean — is what the task workflow guarantees. - -The task workflow operates independently. It can also [integrate](#integrations) with other workflows that benefit from structured task context. The [Git Workflow](./git-workflow.md) is the first such integration, receiving session context at commit time. - -## Persistent State — The Core Mechanism - -The agent's context window is ephemeral — it exists only during a session. Everything that needs to survive across sessions lives in `.agent/`: - -``` -.agent/ -├── task-tracking.md # Priority queue + done history -├── backlog/ # Task specifications (one file per task) -├── archive/ # Completed task specs (moved from backlog/) -├── architecture.md # System-level design guidance -└── vision.md # Project goals and direction -``` - -**`task-tracking.md`** is the priority queue. It contains a table of tasks ordered by priority, each with a status, a link to its backlog file, and a "Depends On" column. The conversational workflow reads it to find the next unblocked task. `/close` updates it when work completes — marking tasks done and unblocking dependents. It also has a "Done" section serving as a chronological record of completed tasks. - -**`backlog/*.md`** files are task specifications — one file per task. Each describes what needs to happen, why it matters, acceptance criteria, and any relevant context. 
These are typically written by the human, though the agent may create them when decomposing larger goals. A backlog file is the primary input to the task workflow — it's what the agent reads to understand what to do and why. - -**`archive/*.md`** is where completed backlog files go. `/close` moves them here on task completion. This preserves the specification and any notes added during implementation, so future sessions can reference past decisions. - -**`architecture.md`** and **`vision.md`** are long-lived context files. They don't change per-task — they provide the broader design guidance and project direction that the conversational workflow reads alongside the task spec. The agent uses these to make implementation decisions that align with the project's overall direction. - -### The Task Lifecycle - -A task flows through a defined lifecycle across sessions: - -``` -Human writes task spec - → backlog/task-name.md (with requirements, acceptance criteria, context) - → task-tracking.md (added to priority queue with dependencies) - -Conversational workflow picks up highest-priority unblocked task - → reads backlog file, loads context, decomposes into steps - → agent implements step by step - -/close after each step - → captures session knowledge, delegates to integrations (e.g. /commit) - → on final step: moves backlog file to archive/, updates tracking, unblocks dependents -``` - -This lifecycle is the backbone of the workflow. Every session starts by reading the persistent state, every session ends by updating it. The files *are* the continuity. - -### Handover Notes — Bridging Sessions - -When a session ends before the task is fully complete (context window running low, or the task is too large for one session), the workflow must write a handover note. 
This goes into `task-tracking.md` or the backlog file and captures: - -- What has been done so far -- What remains to be done -- Key decisions made and why -- What was tried and didn't work -- The human's guidance and preferences from this session -- Concerns or risks identified - -This handover note is the next session's starting context. Without it, the next session starts from scratch and may redo work, make contradictory decisions, or miss context that only existed in the dying session's conversation. The most important thing to preserve is the *reasoning* — code changes are in the working tree, but the reasoning exists only in the conversation and must be written down. - -## Design Principles - -### One Task Per Session - -Each session (= one chat with the agent) focuses on one task. This is deliberate: - -- **Context window efficiency**: Agents work best when their context is focused. Loading multiple tasks creates noise and increases the chance of the agent mixing concerns. -- **Clear boundaries**: The session has a defined start (load task) and end (update tracking). -- **Fresh context**: Each task gets a new chat. Accumulated context from previous tasks creates noise and risks assumptions carrying over. - -One task may involve several steps, each completed and handed off separately. The task workflow breaks a task into **atomic steps** — self-contained chunks that each deliver standalone value with implementation, tests, and documentation. This serves session safety (each step must fit the remaining context window) and, when a git integration is active, [git quality](./git-workflow.md#commit--one-task-atomic-unit-of-work) (each step maps to an atomic commit). - -**Example**: The task "Add caching to KnowledgeLoader" might break into: -1. Step 1: Add LRU cache data structure with unit tests -2. Step 2: Integrate cache into KnowledgeLoader with integration tests -3. Step 3: Add cache invalidation and graceful degradation - -Each step is self-contained. 
Together they form a coherent body of work for the task. - -### Context Window Awareness - -A session has a finite context window. The workflow plans for this at two points: - -**During decomposition**: Steps are sized so each can be completed within the remaining context budget. If a task is too large for one session, the conversational workflow plans only the steps that fit and notes remaining work for the next session. - -**During implementation**: If the session approaches its context limit mid-step, the agent must not silently let the session end with unfinished work. Instead: - -1. **Complete what's possible** — If the current step is in a good state, finalize it via `/close`. -2. **If mid-step, write a handover** — Capture the session's knowledge in a [handover note](#handover-notes--bridging-sessions) so the next session can continue seamlessly. - -**The principle**: Never start an atomic step that can't be finished in the remaining context. If the session must end mid-work, the handover must be rich enough for the next session to continue as if it had been there all along. - -## The Task Workflow - -The workflow manages the session boundaries through conversational orchestration and the `/close` skill: - -| Component | Boundary | Responsibility | -|-----------|----------|---------------| -| **Conversational workflow** | **Start** | Load task, provide context, confirm direction, set up the session | -| **`/close` skill** | **Finish** | Capture knowledge, update tracking, hand off to integrations | - -### Conversational Workflow — Clean Start - -**When to use**: Every time you start working on a task from the priority queue. This includes when the user says "work on next task" or similar phrases. - -**Critical**: ALL five steps below must be executed in order. Do not skip steps (especially step 3: `/branch` invocation). Even for documentation-only or "simple" tasks, the full workflow ensures proper git discipline and tracking. 
- -The agent orchestrates the task startup conversationally, following these steps: - -1. **Check for uncommitted work** — If the working tree is dirty, address it first. Never start a new task with leftover changes. -2. **Load the next task** — Read `.agent/task-tracking.md`, find the highest-priority unblocked task (first row where "Depends On" = "—"). Read the linked backlog file for the full specification. Display the task summary and wait for confirmation before proceeding — the human may reprioritize or choose a different task. (Priorities change — a quick confirmation prevents wasting a session on the wrong task.) -3. **Delegate to integrations** — Hand off to any integrated workflows that need to act at session start. Currently: invoke `/branch` with the task title and thematic area. **This step is mandatory - never work directly on main**. Even for documentation-only changes, proper branching ensures clean git history and enables the PR review gate. (See [Git Workflow integration](#git-workflow).) -4. **Research, plan, and decompose** — Read architectural context (`.agent/architecture.md`, `.agent/vision.md`, relevant source files). Break the task into atomic steps using TodoWrite — each step designed upfront as one concern with implementation + tests + docs, so the agent knows what each step contains and when to pause before moving on. Each step should be self-contained and [completable in the remaining context](#context-window-awareness). If the task is too large for one session, plan only the steps that fit and note remaining work for the backlog file. -5. **Implement step by step** — After each step is complete (implementation + tests passing), invoke `/close` to finalize with context. If more steps remain and context allows, continue. If context is running low, prioritize finalizing cleanly and writing a [handover note](#handover-notes--bridging-sessions) over rushing into the next step. 
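The task lookup in step 2 can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `next_unblocked_task` is hypothetical, and it assumes `task-tracking.md` holds a pipe-delimited Markdown table whose header row names a `Depends On` column, with `—` marking a task that has no open dependency. The real file layout may differ.

```bash
#!/usr/bin/env bash
# Sketch only: print the first table row whose "Depends On" cell is "—".
# The table layout and the function name are assumptions for illustration.
set -euo pipefail

next_unblocked_task() {
  awk -F'|' '
    # Trim surrounding whitespace from a table cell.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Until the header row is found, look for the "Depends On" column.
    col == 0 {
      for (i = 1; i <= NF; i++) if (trim($i) == "Depends On") col = i
      next
    }
    # Skip the |---|---| separator row under the header.
    /^\|[-| :]+\|?[ \t]*$/ { next }
    # The first data row without a dependency is the next task.
    trim($col) == "—" { print trim($0); exit }
  ' "$1"
}
```

Calling `next_unblocked_task .agent/task-tracking.md` would print the matching row, which the agent can then show to the human for confirmation before loading the linked backlog file. Rows are taken in file order, so the sketch relies on the table already being sorted by priority.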
- -### `/close` — Clean Finish - -`/close` has two jobs: (1) capture the session's knowledge before it's lost, and (2) maintain the persistent state so the next session can pick up cleanly. - -It may be called **multiple times per session** — once per atomic step. On the final invocation, it also updates tracking and suggests next steps. - -1. **Identify the task** — Match the current work to the task tracking entry. -2. **Distill context** — Review what happened during this step and construct a summary: what was the goal, what approach was taken and why, what alternatives were considered, what was tested, what risks remain. This doesn't need to be exhaustive — a few sentences covering the key points. But it must be *real*, derived from the actual session, not generic filler. Each step gets its *own* context — don't bundle reasoning from step 1 into step 3. - - **Example** — what a good summary looks like: - ``` - "Added LRU caching to KnowledgeLoader because loading evaluation criteria - from disk on every preflight call was adding ~200ms. Chose LRU over TTL - because the data files are static within a session. Tested with empty cache, - full cache, and cache invalidation on file change. Risk: cache is not - invalidated if data files are edited mid-session, but this is acceptable - since data files only change between releases." - ``` - -3. **Delegate to integrations** — Hand off the distilled context to any integrated workflows that act at step completion. Currently: invoke `/commit` with the summary as `$ARGUMENTS`. (See [Git Workflow integration](#git-workflow).) -4. **If more steps remain** — Return to implementation. -5. **If final step** — Update the persistent state: - - Move the backlog file to `archive/` - - Remove the task from the priority queue - - Add it to the "Done" section in `task-tracking.md` - - Unblock any dependent tasks -6. **Suggest next step** — Based on the priority queue and current state, recommend what to do next. 
(When git integration is active, also consider branch state via `scripts/branch-context.sh` to suggest whether to continue on the branch or open a PR first.) - -## Integrations - -The task workflow operates standalone but is designed to integrate with other workflows that benefit from structured task context and session knowledge. Each integration hooks into the task lifecycle at defined points — typically at session start (conversational workflow) and step completion (`/close`) — receiving context that the integrated workflow cannot derive on its own. - -The pattern is always the same: the task workflow captures knowledge during the session (the "why" — motivation, decisions, trade-offs) and hands it to the integrated workflow, which can only derive the "what" (diffs, outputs, artifacts) mechanically. This handover is what makes integrated workflows produce richer output than they could in isolation. - -### Git Workflow - -The [Git Workflow](./git-workflow.md) is the first integration. It handles branching, commits, and PRs as a standalone workflow with its own skills (`/branch`, `/commit`, `/pr`). The task workflow enhances it by providing the reasoning context that git operations cannot derive from diffs and logs alone. - -**Hook: Task startup → `/branch`** — When the conversational workflow loads a task (step 3), it invokes `/branch` with the task title and thematic area as `$ARGUMENTS`, so `/branch` can decide autonomously whether to create a new branch, continue on the current one, or escalate. - -Good: `Skill(skill="branch", args="Add LRU caching to KnowledgeLoader — performance optimization in the knowledge module")` -Avoid: `Skill(skill="branch", args="CRA-007")` — a task ID tells `/branch` nothing about the theme. - -**Hook: `/close` → `/commit`** — When `/close` finalizes a step (step 3), it invokes `/commit` with the distilled session context as `$ARGUMENTS`: the goal, approach and reasoning, alternatives considered, what was tested, and known risks. 
This is what makes commit messages explain *why*, not just *what*. - -**Hook: `/close` → `/pr` (suggested)** — `/close` doesn't invoke `/pr` directly. On the final step, it reads the branch state and may suggest running `/pr` if the branch is ready for review. - -For the full context contracts and tier system, see [Git Workflow — The Git Skills](./git-workflow.md#the-git-skills). - -## See Also - -- [Git Workflow](./git-workflow.md) — The standalone git workflow (branching, commits, PRs, quality gates) -- `.claude/skills/close/SKILL.md` — Operational details for `/close` diff --git a/scripts/branch-context.sh b/scripts/branch-context.sh deleted file mode 100755 index 2c1a3f1..0000000 --- a/scripts/branch-context.sh +++ /dev/null @@ -1,89 +0,0 @@ -#!/usr/bin/env bash -# Branch context: structured snapshot of the full branch landscape. -# Used by the /branch skill to evaluate fit in one call. -# Output is labeled and consistent — the AI parses it to make branching decisions. -# -# What this script does (in order): -# 1. Fetches from remote (so origin/main is current) -# 2. Cleans up stale branches (merged into main) -# 3. Reports current branch state -# 4. Lists surviving WIP branches with their commits and changed files - -set -euo pipefail - -# ─── Step 1: Fetch from remote ─────────────────────────────────────────────── -git fetch origin --quiet 2>/dev/null || echo "fetch-warning: could not reach remote" - -# ─── Step 2: Clean up stale branches ───────────────────────────────────────── -# Stale = all commits already in main (git branch --merged main), excluding main itself. 
-STALE=$(git branch --merged main 2>/dev/null | grep -v '^\*' | grep -vE '^\s*(main|master)\s*$' | sed 's/^[* ]*//' || true) - -if [ -n "$STALE" ]; then - echo "cleaned-up:" - for BRANCH in $STALE; do - git branch -d "$BRANCH" >/dev/null 2>&1 && echo " - $BRANCH" || echo " - $BRANCH (failed to delete)" - done -else - echo "cleaned-up: none" -fi - -# ─── Step 3: Current branch state ──────────────────────────────────────────── -CURRENT=$(git branch --show-current) -echo "branch: $CURRENT" - -# Clean or dirty working tree -if [ -z "$(git status --porcelain)" ]; then - echo "clean: yes" -else - echo "clean: no" -fi - -# On main or feature branch? -if [ "$CURRENT" = "main" ] || [ "$CURRENT" = "master" ]; then - echo "status: on-main" -else - echo "status: on-feature" - - # Pushed to remote? - if git rev-parse --verify "origin/$CURRENT" >/dev/null 2>&1; then - echo "pushed: yes" - else - echo "pushed: no" - fi - - # Commits on this branch since main - COMMIT_COUNT=$(git rev-list --count main..HEAD 2>/dev/null || echo "0") - echo "commit-count: $COMMIT_COUNT" - echo "commits:" - git log main..HEAD --format=" - %s" 2>/dev/null || echo " (none)" - - # Files changed on this branch since main - echo "files-changed:" - git diff --name-only main...HEAD 2>/dev/null | sed 's/^/ /' || echo " (none)" -fi - -# ─── Step 4: WIP branches (other feature branches with unmerged work) ──────── -# List all local branches except main/master and the current branch. 
-WIP_BRANCHES=$(git branch 2>/dev/null | grep -v '^\*' | grep -vE '^\s*(main|master)\s*$' | sed 's/^[* ]*//' || true) - -if [ -n "$WIP_BRANCHES" ]; then - echo "wip-branches:" - for WIP in $WIP_BRANCHES; do - WIP_COUNT=$(git rev-list --count main.."$WIP" 2>/dev/null || echo "0") - # Check if pushed - if git rev-parse --verify "origin/$WIP" >/dev/null 2>&1; then - WIP_PUSHED="yes" - else - WIP_PUSHED="no" - fi - echo " - name: $WIP" - echo " pushed: $WIP_PUSHED" - echo " commit-count: $WIP_COUNT" - echo " commits:" - git log main.."$WIP" --format=" - %s" 2>/dev/null || echo " (none)" - echo " files-changed:" - git diff --name-only main..."$WIP" 2>/dev/null | sed 's/^/ /' || echo " (none)" - done -else - echo "wip-branches: none" -fi