
feat(judge): retain per-platform/per-layer detail in JudgeResult #73

Merged
dadachi merged 1 commit into main from feat/judge-platform-detail on May 22, 2026

@dadachi (Contributor) commented May 22, 2026

What

Step 1 of the HTML validation report (docs/validation-report.md).

runJudge already computed full Layer 1 findings and Layer 2 command/duration/stderrTail inside evaluate(), then discarded everything but counts and a summary string — so the report data never reached the caller. This PR widens evaluate() to return the full Layer1Result + Layer2Result and surfaces them on a new field.

Changes

  • JudgeResult gains an optional, additive platforms?: readonly PlatformDetail[].
  • New PlatformDetail type: layer1 (pass + findings), layer2 (pass, command, mode, exitCode, durationMs, stderrTail), and the matching layer3 VisualJudgePlatformReport for ios/android (rails has none).
  • evaluate() returns the full layer results; trace lines + pass aggregates updated to the new shape (no behavior change to logs or summary).
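
The shape described in the list above might look roughly like the following. This is a sketch inferred only from the field names in this description, not the merged source: the primitive types, the `Finding` shape, the `platform` discriminator, the `mode` type, the placeholder body of `VisualJudgePlatformReport`, and the pre-existing `JudgeResult` fields (`pass`, `summary`) are all assumptions.

```typescript
// Sketch only: field names come from the PR description; every type
// annotation and the Finding / VisualJudgePlatformReport bodies are assumed.
type Platform = "ios" | "android" | "rails";

interface Finding {
  readonly message: string; // hypothetical shape
}

interface Layer1Result {
  readonly pass: boolean;
  readonly findings: readonly Finding[];
}

interface Layer2Result {
  readonly pass: boolean;
  readonly command: string;
  readonly mode: string; // assumed to be a plain string label
  readonly exitCode: number | null; // assumed nullable
  readonly durationMs: number;
  readonly stderrTail: string;
}

interface VisualJudgePlatformReport {
  readonly platform: "ios" | "android"; // placeholder body
}

interface PlatformDetail {
  readonly platform: Platform; // assumed discriminator
  readonly layer1: Layer1Result;
  readonly layer2: Layer2Result;
  readonly layer3?: VisualJudgePlatformReport; // absent for rails
}

interface JudgeResult {
  readonly pass: boolean; // assumed existing field
  readonly summary: string; // assumed existing field
  readonly platforms?: readonly PlatformDetail[]; // new, optional, additive
}

// Illustrative value; the command, mode, and summary strings are made up.
const example: JudgeResult = {
  pass: false,
  summary: "1 platform failed",
  platforms: [
    {
      platform: "rails",
      layer1: { pass: true, findings: [] },
      layer2: {
        pass: false,
        command: "example-command",
        mode: "example",
        exitCode: 1,
        durationMs: 1200,
        stderrTail: "...",
      },
    },
  ],
};
```

Keeping `platforms` optional and `readonly` is what lets the field stay additive: values built before this change still type-check, and consumers cannot mutate the retained detail.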

Compatibility

  • Additive only. Existing consumers (CLI summary line, MCP generate_app) ignore the new field.
  • The stub judge path omits platforms (unchanged).
  • npm run ci green locally — build + typecheck + 41/41 tests pass.
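
To illustrate the compatibility claim, a consumer could treat a missing `platforms` field (as on the stub judge path) the same as an empty list. The sketch below is hypothetical: `describeResult` and the `...Like` interfaces are illustrative names, and the `platform` and `summary` fields are assumed rather than taken from the repository.

```typescript
// Hypothetical consumer: names and field types are illustrative only.
interface LayerPass {
  readonly pass: boolean;
}

interface PlatformDetailLike {
  readonly platform: string; // assumed field
  readonly layer1: LayerPass;
  readonly layer2: LayerPass;
}

interface JudgeResultLike {
  readonly summary: string; // assumed field
  readonly platforms?: readonly PlatformDetailLike[];
}

function describeResult(result: JudgeResultLike): string[] {
  const lines = [result.summary];
  // `?? []` makes the stub path (no platforms) behave like "no detail".
  for (const p of result.platforms ?? []) {
    const l1 = p.layer1.pass ? "pass" : "fail";
    const l2 = p.layer2.pass ? "pass" : "fail";
    lines.push(`${p.platform}: layer1=${l1} layer2=${l2}`);
  }
  return lines;
}
```

A consumer that reads only `summary` keeps working unchanged, while one that wants detail opts in by iterating `platforms`.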

Renderer + report.json emission come in follow-up PRs (steps 3–7 in the spec).

🤖 Generated with Claude Code

Step 1 of the HTML validation report (docs/validation-report.md).

runJudge already computed full Layer 1 findings and Layer 2 command/
duration/stderr inside evaluate(), then discarded everything but counts
and a summary string. Widen evaluate() to return the full Layer1Result
+ Layer2Result, and surface them on a new optional, additive
JudgeResult.platforms (readonly PlatformDetail[]).

Each PlatformDetail carries layer1 (pass + findings), layer2 (pass,
command, mode, exitCode, durationMs, stderrTail), and the matching
layer3 VisualJudgePlatformReport for ios/android (rails has none).

Back-compatible: existing consumers (CLI summary, MCP) ignore the new
field; the stub judge path omits it. This is the structured data the
validation report renderer will consume in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dadachi dadachi merged commit 8aa6fd5 into main May 22, 2026
1 check passed
@dadachi dadachi deleted the feat/judge-platform-detail branch May 22, 2026 09:14
dadachi added a commit that referenced this pull request May 22, 2026
#74)

Implements steps 3-7 of docs/validation-report.md (steps 1-2 shipped in
#73). Every run now writes out/<slug>/report.json (machine-readable
RunReport) and validation-report.html (self-contained, screenshots
base64-embedded) summarizing all validation layers + the reviewer.

- src/report/model.ts   — RunReport / RunMeta / RepairAttempt / AssetMap
- src/report/render.ts   — renderReport(report, assets): pure, no I/O;
                           gate strip, platform×layer matrix, Layer 1
                           findings, Layer 2 stderr, Layer 3 screenshots
                           + rubric rationales (incl. Stage 2 filmstrip),
                           reviewer diff, rename-plan appendix, reproduce
                           footer. HTML-escaped throughout.
- src/report/theme.ts    — inline brand CSS (matches social-preview)
- src/report/collect.ts  — buildRunReport (pure assembly), resolveAssets
                           (embed as data: URI, or copy to report-assets/),
                           writeReport (report.json + HTML)
- src/version.ts         — shared readPackageVersion (dedups mcp.ts copy)
- dispatch: records timing, assembles + writes the report, returns
  DispatchResult = JudgeResult & { report, reportPaths }. Writing
  defaults off in stub mode so the test suite never touches ./out.
- index: --no-report / --report-format / --report-embed / --report-open
  flags; prints a file:// path. mcp: returns report + html/json paths in
  structuredContent.
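
Two of the ideas in the list above, HTML escaping in a pure renderer and embedding screenshots as `data:` URIs, can be sketched as follows. This is not the code in src/report/render.ts or src/report/collect.ts: `escapeHtml`, `toDataUri`, and `renderScreenshot` are hypothetical helpers, and the markup is invented for illustration.

```typescript
// Sketch only: helper names and markup are assumptions, not the repo's code.

// Escape the five characters that matter in HTML text and attribute values.
// `&` is replaced first so already-produced entities are not double-escaped.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build a data: URI from bytes that are already base64-encoded.
function toDataUri(mimeType: string, base64: string): string {
  return `data:${mimeType};base64,${base64}`;
}

// Pure function: takes strings, returns a string, performs no I/O.
function renderScreenshot(caption: string, pngBase64: string): string {
  const src = toDataUri("image/png", pngBase64);
  const safe = escapeHtml(caption);
  return `<figure><img src="${src}" alt="${safe}"><figcaption>${safe}</figcaption></figure>`;
}
```

Because the renderer only maps inputs to a string, it can be tested with plain string assertions, and the resulting HTML needs no external files when every image is embedded.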

Tests: pure renderer assertions (findings, stderr, diff, rename plan,
HTML escaping, empty-platforms summary fallback), writeReport round-trip
(self-contained — asserts no tmp/ paths and no external URLs leak),
embed=false externalization, and buildRunReport assembly. 46/46 pass.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dadachi added a commit that referenced this pull request May 22, 2026
…kflow (#76)

Adds section 10 to docs/validation-report.md covering how to render and
visually verify validation-report.html:

- Path A — Playwright MCP (interactive, inside Claude Code): local-scope
  `claude mcp add playwright`, restart to load tools, then browser_navigate
  (file:// URL) + browser_take_screenshot. Note it's a maintainer
  convenience, not part of the pipeline (device shots use mobile-mcp).
- Path B — Playwright CLI (headless, no MCP, CI-friendly):
  `npx -y playwright@latest screenshot --full-page file://… report.png`,
  no project dependency added.
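
For scripting Path B, the argument list for `npx` could be assembled as below. `screenshotArgs` is a hypothetical helper, not part of the repository; it only mirrors the flags quoted in Path B, and the paths in the usage note are placeholders.

```typescript
// Sketch only: screenshotArgs is a hypothetical helper mirroring Path B.
function screenshotArgs(reportUrl: string, outputPng: string): string[] {
  // Path B renders a local report, so anything but file:// is rejected here.
  if (!reportUrl.startsWith("file://")) {
    throw new Error("expected a file:// URL for the local report");
  }
  return ["-y", "playwright@latest", "screenshot", "--full-page", reportUrl, outputPng];
}
```

The returned array could be passed to Node's `child_process.spawn("npx", args)`, with the URL produced by `pathToFileURL` from `node:url`. For example, `screenshotArgs("file:///placeholder/validation-report.html", "report.png")` yields the same arguments as the command quoted above.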

Also marks section 9 as shipped (#73/#74/#75).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>