spike(p4_beat): structured Pydantic html: str — 5/5 acceptance (HOM-243)#136
Open
sidorovanthon wants to merge 3 commits into
…M-243)
Reverses the HOM-134 FS-source-of-truth retreat for `p4_beat` only,
gated on a 5/5 acceptance run (see scripts/spike_hom243.py + spec
docs/superpowers/specs/2026-05-10-state-first-artifacts.md §10 Step 0).
- Add `BeatOutput { html: str }` Pydantic schema in nodes/p4_beat.py.
- Wire `output_schema=BeatOutput` + drop `Write` from `allowed_tools`
in `_build_node()`. `result_key` becomes `_beat_html_spike` — fan-out
reducer is OUT OF SCOPE per spec; acceptance inspects the returned
dict directly, production fan-in remains FS-driven.
- `p4_beat_node` no longer pre-creates the scene HTML directory (no
`Write` to land in it).
- Update brief output-shape + Process steps to instruct the sub-agent
to return the scene fragment as the `html` field of the structured
response (single fenced ```json``` block; the orchestrator's
`_schema_extract` accepts the first valid fenced JSON). Canon read
list, hard rules, anti-patterns, palette/typography, density and
exit-pair sections are UNTOUCHED — brief still references canon by
path, never embeds (CLAUDE.md §"Decomposition via brief-references-canon").
- Refresh `tests/snapshots/briefs/p4_beat.txt`.
- Bump `_CACHE_VERSION` 7 → 8 with HOM-243 rationale comment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
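The extraction rule described in the brief change above (accept the first valid fenced JSON block, then require an `html` string) can be sketched roughly as follows. This is a stdlib-only illustration, not the orchestrator's code: `extract_beat_html` and the regex are hypothetical, and the dict check merely stands in for validation against the Pydantic `BeatOutput` schema. The real `_schema_extract` may behave differently.

```python
import json
import re

# Built programmatically so this example does not nest literal code fences.
FENCE = "`" * 3

# Hypothetical pattern: a fenced block tagged `json`, captured non-greedily.
_FENCED_JSON = re.compile(FENCE + r"json\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def extract_beat_html(response_text: str) -> str:
    """Return `html` from the first fenced JSON block that parses and validates."""
    for match in _FENCED_JSON.finditer(response_text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # not valid JSON; try the next fenced block
        # Stand-in for BeatOutput validation: a dict with a string `html` field.
        if isinstance(payload, dict) and isinstance(payload.get("html"), str):
            return payload["html"]
    raise ValueError("no fenced JSON block with a string `html` field")
```

A block that fails to parse is skipped rather than treated as fatal, which is one plausible reading of "first valid"; a stricter extractor could instead fail on the first fenced block.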
…HOM-243)
Measurement script for the HOM-243 pre-migration spike. Runs N (default 5)
sequential paid dispatches of `p4_beat` against the canonical fixture's
`hook` beat, records per-attempt success/html_chars/retry_count/exception/
wall_time, writes summary to `docs/spikes/hom-243-results.json`, exits
non-zero on acceptance failure (5/5 success + every html_chars >= 5_000 +
every retry_count == 0).
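The acceptance rule above can be expressed as a small predicate. The sketch below is an assumption about shape only: `Attempt`, `passes_acceptance`, and `exit_code` are hypothetical names, not taken from `scripts/spike_hom243.py`; the fields mirror the per-attempt values the commit message lists.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Attempt:
    """One paid dispatch, with the per-attempt fields the script records."""
    success: bool
    html_chars: int
    retry_count: int
    exception: Optional[str] = None
    wall_time_s: float = 0.0


def passes_acceptance(
    attempts: Sequence[Attempt],
    required_runs: int = 5,
    min_html_chars: int = 5_000,
) -> bool:
    """True only if every required run succeeded, was large enough, and never retried."""
    if len(attempts) != required_runs:
        return False
    return all(
        a.success and a.html_chars >= min_html_chars and a.retry_count == 0
        for a in attempts
    )


def exit_code(attempts: Sequence[Attempt], required_runs: int = 5) -> int:
    """Map the gate onto a process exit status: 0 on pass, non-zero on failure."""
    return 0 if passes_acceptance(attempts, required_runs) else 1
```

Under this reading a single retry or a single undersized fragment fails the whole run, matching the all-or-nothing wording of the gate.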
State is reconstructed from the committed fixture cache.db via raw SQLite
+ LangGraph's `JsonPlusSerializer` (same path as
`tests/_helpers/replay_dispatch`) — no graph runtime, no LLM cache hit
risk. The dispatch goes straight through `LLMNode.__call__` so the
production `CachePolicy` is bypassed by construction; every iteration is
a real paid call.
DO NOT execute under replay/$0 conditions. Operator authorisation
required (~$10 cap). Invocation:
    $env:HOMESTUDIO_PROJECT_ROOT = "$PWD\tests\fixtures"
    $env:PYTHONPATH = "graph\src"
    graph\.venv\Scripts\python.exe scripts\spike_hom243.py
`docs/spikes/README.md` documents the spike directory convention.
The results JSON is intentionally NOT committed in this PR — operator
populates after the paid run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pilot: 1/1, html=7238 chars, 0 retries, 140s.
Full: 5/5, html ∈ [5524, 7022] chars, 0 retries, 72-134s/dispatch.
Tier: claude-opus-4-7 (production "expensive").
Total spend ≈ $9 (6 dispatches at ~$1.50).
No SchemaValidationError, no truncation, no JSON-escape corruption
observed across any of 6 paid attempts.
Acceptance gate per spec
docs/superpowers/specs/2026-05-10-state-first-artifacts.md §10 Step 0: PASSES.
HOM-230 epic proceeds with state-channel storage path (HOM-231..242).
BaseStore §6.0 fallback is NOT taken.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
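For readers unfamiliar with the "JSON-escape corruption" failure mode mentioned above: an HTML fragment carried inside a JSON string must survive escaping of quotes, backslashes, newlines, and non-ASCII characters. The sketch below shows one way such a round-trip could be checked. `SAMPLE_HTML`, `encode_payload`, and `decode_payload` are hypothetical and are not the checks the spike actually ran.

```python
import json

# Illustrative fragment with characters that need JSON escaping:
# double quotes, a backslash, newlines, and a non-ASCII letter.
SAMPLE_HTML = '<section class="scene" data-note="a \\ b">\n  <p>"quoted" &amp; café</p>\n</section>'


def encode_payload(html: str) -> str:
    """Serialise the fragment the way a structured response would carry it."""
    return json.dumps({"html": html})


def decode_payload(raw: str) -> str:
    """Parse the response and return `html`, rejecting any other shape."""
    payload = json.loads(raw)
    if not isinstance(payload, dict) or not isinstance(payload.get("html"), str):
        raise ValueError("payload is not an object with a string `html` field")
    return payload["html"]


def survives_round_trip(html: str) -> bool:
    """True if encoding then decoding returns the fragment unchanged."""
    return decode_payload(encode_payload(html)) == html
```

A double-encoded payload (a JSON string that itself contains JSON) is one corruption this catches: `json.loads` then yields a `str` rather than an object, and `decode_payload` raises.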
sidorovanthon pushed a commit that referenced this pull request May 10, 2026
…0 status
- §10 Step 0: add 2026-05-10 status note recording HOM-243 spike PASS
  (6/6 paid dispatches, 0 SchemaValidationError, html ∈ [5524, 7238]).
  Re-trigger condition documented (tier downgrade or structural brief
  mutation).
- §10b: rename "sub-issue HOM-243 candidate" → "future sub-issue (number
  TBD)"; HOM-243 is the spike PR #136, not the atomic-record protocol.
- Section numbering: orphaned sizing-budget table now lives under a
  proper "## 11. Sizing budget" heading; cascading renumber of the
  duplicate §12 ("Open questions" → §13), §13 → §14 (Acceptance),
  §14 → §15 (References), §15 → §16 (CLAUDE.md amendments).
  Cross-reference on line 119 updated (§13 → §11).
Per PR #135 review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sidorovanthon added a commit that referenced this pull request May 10, 2026
…utputs (#135)
* docs(spec): state-first artifacts - single-source-of-truth for node outputs
* docs(spec): revise state-first-artifacts per independent review (HOM-230)
  - Honestly scope to 4/7 incidents; flag schema-evolution +
    process-discipline as orthogonal
  - Add §6.0 Considered alternatives: evaluate LangGraph BaseStore vs
    state-channel
  - Add pre-migration spike on p4_beat structured output (HOM-134 prior)
  - Atomic-record protocol as complementary fix
  - Split step D into D1 (read switch) + D2 (strip dual-write + git rm)
  - Move compose.scenes reducer + test into step A
  - Add §Risks & rollback section
* docs(spec): apply review nits - HOM-243 collision, §-numbering, Step 0 status
  - §10 Step 0: add 2026-05-10 status note recording HOM-243 spike PASS
    (6/6 paid dispatches, 0 SchemaValidationError, html ∈ [5524, 7238]).
    Re-trigger condition documented (tier downgrade or structural brief
    mutation).
  - §10b: rename "sub-issue HOM-243 candidate" → "future sub-issue
    (number TBD)"; HOM-243 is the spike PR #136, not the atomic-record
    protocol.
  - Section numbering: orphaned sizing-budget table now lives under a
    proper "## 11. Sizing budget" heading; cascading renumber of the
    duplicate §12 ("Open questions" → §13), §13 → §14 (Acceptance),
    §14 → §15 (References), §15 → §16 (CLAUDE.md amendments).
    Cross-reference on line 119 updated (§13 → §11).
  Per PR #135 review.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: anticodeguy <anticodeguy@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
HOM-243 pre-migration spike. Hard-gates the HOM-230 state-first-artifacts epic (HOM-231..242) — 5/5 paid dispatches must succeed with structured-output extraction on ~30 KB body before the migration plan proceeds.
Outcome: PASS. Migration proceeds with state-channel storage; the §6.0 BaseStore fallback is NOT taken.
Results — 6/6 paid dispatches succeeded
Evidence: `docs/spikes/hom-243-results.json`, `docs/spikes/hom-243-pilot.log`, `docs/spikes/hom-243-full.log`.
Code changes
Production-break warning — DO NOT MERGE before HOM-231 lands
`p4_beat` no longer writes `scene_html_path` to disk, but `p4_assemble_index.py:588` still reads it via `read_text`. Merging before Step A (`compose.scenes` reducer) + Step B (assemble switches to state read) would break Phase 4 production. The spec calls for the spike PR to land as Step 0; in practice, this PR's evidence value lies in `docs/spikes/`, and the schema/brief edits should be either:
Defer merge decision to operator. Evidence stands either way.
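To make the break concrete: once the beat node stops writing the file, a reader that only calls `read_text` on `scene_html_path` fails with `FileNotFoundError`. The sketch below illustrates a reader that prefers an in-state `html` value and falls back to the file. It is an illustration only, not the planned Step B implementation; `resolve_scene_html` and the dict shape are hypothetical, with only the `html` and `scene_html_path` names taken from this PR.

```python
from pathlib import Path
from typing import Mapping


def resolve_scene_html(beat_result: Mapping[str, object]) -> str:
    """Prefer HTML carried in state; fall back to the on-disk scene file."""
    html = beat_result.get("html")
    if isinstance(html, str) and html:
        return html  # state-first path: nothing is read from disk

    path = beat_result.get("scene_html_path")
    if isinstance(path, str) and Path(path).is_file():
        # Legacy FS-driven path, still used by production fan-in.
        return Path(path).read_text(encoding="utf-8")

    # Neither source is available: the situation that merging this PR
    # ahead of the assemble switch would create for an FS-only reader.
    raise FileNotFoundError(
        f"no html in state and no scene file at {path!r}"
    )
```

A dual-source reader like this is only a transitional shape; the spec's later steps (read switch, then stripping the dual-write) describe the intended end state.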
Test plan
🤖 Generated with Claude Code