Research: harness-layer leverage points (spatiotemporal context, PTC, sub-agent isolation, skills-as-config) #10

@datashaman

Track research into four harness-layer ideas raised by @yyz81681981 on X. The claim is that the real leverage in agent systems is not at the model/prompt layer but at the harness — i.e. the surrounding scaffolding that runs the model. Worth a structured investigation: which of these are already in Claude Code / our executors, which are missing, and which are worth pulling into Specify.

The four ideas

  1. Spatiotemporal management of context: spatial (what the agent sees right now in working memory) vs. temporal (what it has seen across the run / session). Distinct retrieval policies for each.
  2. PTC execution model — Plan → Tool → Critique loop instead of monolithic agent turns. Critique is a first-class phase, not a final-step ask.
  3. Sub-agent isolation + proactive memory — likened to "Iris's dreaming cycle". Sub-agents run isolated; their memories are consolidated/promoted back to the parent rather than streamed inline.
  4. Constraint files / Skills as config — supply behaviour via loadable skill/constraint files rather than rewriting the system prompt every time.
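
As a rough illustration of idea 2, the Plan → Tool → Critique loop can be sketched in a few lines. This is a language-neutral sketch in Python, not Specify code: `run_ptc`, `StepResult`, and the three callables are hypothetical names, and the toy functions stand in for real model and tool calls. The point it shows is that critique runs every round and feeds the next plan, rather than being asked once at the end.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    plan: str       # what the planner decided to do this round
    output: str     # what the tool phase produced
    accepted: bool  # critic's verdict
    feedback: str   # critic's note, fed into the next plan if rejected


def run_ptc(
    task: str,
    plan: Callable[[str, List[str]], str],
    act: Callable[[str], str],
    critique: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[StepResult]:
    """Run Plan -> Tool -> Critique rounds until accepted or rounds run out."""
    history: List[StepResult] = []
    feedback: List[str] = []
    for _ in range(max_rounds):
        step_plan = plan(task, feedback)         # Plan: sees prior critiques
        output = act(step_plan)                  # Tool: execute the plan
        accepted, note = critique(task, output)  # Critique: first-class phase
        history.append(StepResult(step_plan, output, accepted, note))
        if accepted:
            break
        feedback.append(note)
    return history


# Toy stand-ins (assumptions for illustration only).
def toy_plan(task: str, feedback: List[str]) -> str:
    return f"{task} (attempt {len(feedback) + 1})"


def toy_act(step_plan: str) -> str:
    return step_plan.upper()


def toy_critique(task: str, output: str) -> Tuple[bool, str]:
    ok = "ATTEMPT 2" in output
    return ok, ("" if ok else "needs another pass")


history = run_ptc("add tests", toy_plan, toy_act, toy_critique)
for step in history:
    print(step.accepted, step.output)
```

With these toy functions the first round is rejected and the second is accepted, so the loop stops after two rounds.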

Source

The real leverage is at the Harness layer:

  • Spatiotemporal management of context (spatial context vs. temporal context)
  • PTC execution model (Plan → Tool → Critique)
  • Sub-agent isolation + proactive memory (similar to Iris's dreaming cycle)
  • Constraint files / Skills as config, rather than rewriting the system prompt every time

https://x.com/yyz81681981

Why this matters for Specify

Specify already has executor-level harness (`app/Services/Executors/`), context builders (`app/Services/Context/`), and a prompt loader (`app/Services/Prompts/PromptLoader.php`). Mapping each of the four ideas onto those surfaces is the cheapest way to find out which gaps are real.

  • Spatial vs temporal maps onto our `RecencyContextBuilder` — currently one builder; the spatial/temporal split would be two.
  • PTC could become an Executor capability flag (cf. ADR-0003 `supports*` flags); a critique phase between the agent's diff and the PR open is a natural insertion point.
  • Sub-agent isolation lines up with the race-mode siblings (ADR-0006). Promoting sub-agent memory back into the parent is closer to ADR-0005's append-only plan growth than it looks.
  • Skills as config lines up with our `prompts/` directory and the planned context-brief work (ADR-0011 progress events). Claude Code already does this with `~/.claude/skills/`, so there is a working reference.
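
To make the spatial/temporal split in the first bullet concrete, here is a minimal sketch of two retrieval policies over the same event log. It is written in Python for illustration and does not reflect how `RecencyContextBuilder` is implemented: `Event`, `spatial_context`, and `temporal_context` are hypothetical names, and the policies themselves (latest event per open path vs. last N events of the run) are assumptions about what such a split might look like.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Event:
    step: int  # position in the run, increasing over time
    path: str  # file or resource the event touched
    text: str  # what happened


def spatial_context(events: Iterable[Event], open_paths: Set[str]) -> List[Event]:
    """Working-memory view: the latest event for each path in focus now."""
    latest: Dict[str, Event] = {}
    for event in sorted(events, key=lambda e: e.step):
        if event.path in open_paths:
            latest[event.path] = event  # later steps overwrite earlier ones
    return sorted(latest.values(), key=lambda e: e.step)


def temporal_context(events: Iterable[Event], limit: int = 3) -> List[Event]:
    """Run-history view: the most recent events, regardless of path."""
    ordered = sorted(events, key=lambda e: e.step)
    return ordered[-limit:] if limit > 0 else []


events = [
    Event(1, "a.py", "read a"),
    Event(2, "b.py", "read b"),
    Event(3, "a.py", "edit a"),
    Event(4, "c.py", "run tests"),
]

spatial = spatial_context(events, {"a.py"})
temporal = temporal_context(events, limit=2)
print([e.text for e in spatial])
print([e.text for e in temporal])
```

The two functions read the same log but answer different questions, which is the argument for two builders rather than one.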

Investigation tasks

  • Map each idea to Specify's existing harness surfaces. For each: where is the closest analogue today, what's missing, what's the smallest viable insertion?
  • Find primary sources for the source claims. The X post is a summary; locate the underlying work (Iris reference, PTC literature, etc.) before designing.
  • Pull existing implementations to compare: how do Claude Code's skills + sub-agents behave, what does Cursor / Continue / Aider do at the harness layer?
  • Write up which of the four ideas are worth ADR-grade work for Specify and which are out-of-scope or already covered. One short doc with go / defer / skip per idea.
  • If any idea promotes to ADR: open a follow-up issue with a concrete proposed shape.

Out of scope

  • No code changes from this issue. This is the research scaffold; concrete proposals fork off into their own ADRs.
