Track research into four harness-layer ideas raised by @yyz81681981 on X. The claim is that the real leverage in agent systems is not at the model/prompt layer but at the harness, i.e. the surrounding scaffolding that runs the model. This warrants a structured investigation: which of these ideas already exist in Claude Code or our executors, which are missing, and which are worth pulling into Specify.
The four ideas
- Spatiotemporal management of context — spatial (what the agent sees right now in working memory) vs temporal (what it has seen across the run / session). Distinct retrieval policies for each.
- PTC execution model — Plan → Tool → Critique loop instead of monolithic agent turns. Critique is a first-class phase, not a final-step ask.
- Sub-agent isolation + proactive memory — likened to "Iris's dreaming cycle". Sub-agents run isolated; their memories are consolidated/promoted back to the parent rather than streamed inline.
- Constraint files / Skills as config — supply behaviour via loadable skill/constraint files rather than rewriting the system prompt every time.
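To make the PTC idea concrete for the investigation, here is a minimal, language-neutral sketch in Python of a Plan → Tool → Critique loop. Everything in it is hypothetical: `run_ptc`, `StepResult`, and the `plan` / `tool` / `critique` callables are illustration names, not anything in Specify or Claude Code. The one point it tries to show is that critique runs on every round and feeds back into the next plan, rather than being asked once at the end.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    """One round of the loop (hypothetical record shape)."""
    plan: str
    output: str
    accepted: bool
    feedback: str = ""


def run_ptc(
    task: str,
    plan: Callable[[str, str], str],                  # (task, last feedback) -> plan
    tool: Callable[[str], str],                       # plan -> tool output
    critique: Callable[[str, str], Tuple[bool, str]], # (task, output) -> (accepted, feedback)
    max_rounds: int = 3,
) -> List[StepResult]:
    """Run Plan -> Tool -> Critique until the critique accepts or rounds run out."""
    history: List[StepResult] = []
    feedback = ""
    for _ in range(max_rounds):
        step_plan = plan(task, feedback)              # Plan: sees the previous critique
        output = tool(step_plan)                      # Tool: execute the plan
        accepted, feedback = critique(task, output)   # Critique: first-class phase
        history.append(StepResult(step_plan, output, accepted, feedback))
        if accepted:
            break
    return history
```

A caller would supply the three phases as separate functions, which is also what would let a harness swap in a different critic without touching the planner.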
Source
The real leverage is at the Harness layer:
- Spatiotemporal management of context (spatial context vs. temporal context)
- PTC execution model (Plan → Tool → Critique)
- Sub-agent isolation + proactive memory (similar to Iris's dreaming cycle)
- Constraint files / Skills as config, rather than rewriting the system prompt every time
— https://x.com/yyz81681981
Why this matters for Specify
Specify already has an executor-level harness (`app/Services/Executors/`), context builders (`app/Services/Context/`), and a prompt loader (`app/Services/Prompts/PromptLoader.php`). Mapping each of the four ideas onto those surfaces is the cheapest way to find out which gaps are real.
- Spatial vs temporal maps onto our `RecencyContextBuilder`: it is currently one builder, and the spatial/temporal split would call for two.
- PTC could become an Executor capability flag (cf. ADR-0003 `supports*` flags); a critique phase between the agent's diff and the PR open is a natural insertion point.
- Sub-agent isolation lines up with the race-mode siblings (ADR-0006). Promoting sub-agent memory back into the parent is closer to ADR-0005's append-only plan growth than it looks.
- Skills as config lines up with our `prompts/` directory and the planned context-brief work (ADR-0011 progress events). Claude Code already does this with `~/.claude/skills/`, so there is a working reference.
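As a starting point for the spatial/temporal mapping, the split into two builders could look roughly like the sketch below. It is a language-neutral illustration in Python and does not mirror how `RecencyContextBuilder` works today; `Event`, `spatial_context`, `temporal_context`, the character budget, and the keyword match are all assumptions made for the example. The spatial policy picks the newest events that fit a budget; the temporal policy searches the whole run regardless of recency.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """One thing the agent saw during a run (hypothetical record shape)."""
    step: int
    text: str


def spatial_context(events: List[Event], budget_chars: int) -> List[Event]:
    """Working set: the newest events that fit the budget, returned oldest first."""
    picked: List[Event] = []
    used = 0
    for event in sorted(events, key=lambda e: e.step, reverse=True):
        if used + len(event.text) > budget_chars:
            break  # stop at the first event that no longer fits
        picked.append(event)
        used += len(event.text)
    return list(reversed(picked))


def temporal_context(events: List[Event], query: str, limit: int) -> List[Event]:
    """Run history: events from any point in the run that share a word with the query."""
    terms = {t.lower() for t in query.split()}
    hits = [e for e in events if terms & set(e.text.lower().split())]
    return sorted(hits, key=lambda e: e.step)[:limit]
```

The two functions take the same event log but answer different questions, which is the argument for two builders with distinct retrieval policies rather than one builder with a mode switch.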
Investigation tasks
Out of scope
- No code changes from this issue. This is the research scaffold; concrete proposals fork off into their own ADRs.