test(b5): Domain E RED tests — apps/perf/lint polish [F#3 F#8 F#9 F#13 F#16] by SyncTekLLC · Pull Request #148 · SyncTek-LLC/simdrive

SyncTekLLC · 2026-05-22T19:10:23Z

Summary

RED test suite for SimDrive b5 Domain E dogfood findings. On HEAD, 14 tests fail and 3 pass as shape-preservation anchors.

F#3: apps() must read CFBundleShortVersionString from the app's Info.plist when the simctl listapps plist omits it, and fall back to the build number when both sources lack it
F#8: tap(verify_change=True) must capture pre/post screenshots and return screen_changed: bool and ssim_delta: float
F#9: perf.snapshot() must sample CPU over a 200 ms window and return a sample_window_ms field instead of reporting an instantaneous 0.0 value
F#13: list_replays() must accept a min_steps param (default=1) to filter out 0-step placeholder recordings
F#16: LintResult needs a category field; 0-step recordings must be classified 'empty', not 'fail: no requires block'
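
As an illustration of the F#3 fallback order, here is a minimal sketch using only the standard library. The function names `read_app_info_plist` and `resolve_version`, and the dictionary keys `Path` and `CFBundleVersion`, are assumptions made for this example; the PR only names the `_read_app_info_plist` helper and does not show its body.

```python
import plistlib
from pathlib import Path
from typing import Optional
from xml.parsers.expat import ExpatError


def read_app_info_plist(bundle_path: str) -> dict:
    """Return the parsed Info.plist of an app bundle, or {} if it cannot be read."""
    plist_file = Path(bundle_path) / "Info.plist"
    try:
        with plist_file.open("rb") as fh:
            return plistlib.load(fh)
    except (OSError, ValueError, ExpatError):
        # Missing bundle, unreadable file, or malformed plist: treat as "no data".
        return {}


def resolve_version(simctl_entry: dict) -> Optional[str]:
    """Pick a version string: simctl value, then Info.plist, then build number."""
    version = simctl_entry.get("CFBundleShortVersionString")
    if version:
        return version
    bundle_path = simctl_entry.get("Path")  # assumed key name for the bundle path
    info = read_app_info_plist(bundle_path) if bundle_path else {}
    return (
        info.get("CFBundleShortVersionString")
        or simctl_entry.get("CFBundleVersion")  # assumed key for the build number
        or info.get("CFBundleVersion")
    )
```

With this shape, an entry that already carries a short version string never touches the filesystem, which keeps the common case as cheap as before.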

Test plan

pytest tests/test_b5_domain_e_apps_perf_lint.py -m "not live" — all 14 RED tests fail on HEAD
After CodeAtlas implements each finding, corresponding tests go GREEN
3 passing shape-anchor tests must remain passing throughout

🤖 Generated with Claude Code

SyncTekLLC · 2026-05-22T20:22:49Z

Coverage filled in via new test file simdrive/tests/test_b5_domain_e_coverage.py (17 tests, production code unchanged).

Before: 89.21% (below --fail-under=90 gate)
After: 91.05% (gate passes)

New tests target the specific new production paths from F#3/F#8/F#9/F#13/F#16:

TestComputeSsim (9 tests) — covers _compute_ssim body in server.py (lines 815-874): None paths, missing files, non-PNG bytes, RGB/RGBA PNGs, size mismatches
TestToolTapVerifyChange (3 tests) — covers verify_change block in tool_tap (server.py 1169-1173)
TestLintOneOsError (2 tests) — covers OSError branch in _lint_one (recorder.py 843-844)
TestLintResultCategoryField (3 tests) — exercises to_dict() for the new F#16 category field
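
The OSError coverage technique mentioned above (patching `Path.read_text`) can be sketched as follows. `lint_one` here is a hypothetical stand-in, not the real `_lint_one` from recorder.py, and the returned dictionary shape and the 'fail' status on read errors are assumptions made for the example.

```python
from pathlib import Path
from unittest import mock


def lint_one(path: Path) -> dict:
    """Hypothetical stand-in for _lint_one: report unreadable files instead of raising."""
    try:
        text = path.read_text(encoding="utf-8")
    except OSError as exc:
        return {"path": str(path), "status": "fail", "message": f"unreadable: {exc}"}
    return {"path": str(path), "status": "ok", "message": f"{len(text)} chars"}


def test_lint_one_reports_oserror() -> None:
    # Patch Path.read_text so the except branch runs without touching the disk.
    with mock.patch.object(Path, "read_text", side_effect=OSError("permission denied")):
        result = lint_one(Path("recording.yaml"))
    assert result["status"] == "fail"
    assert "permission denied" in result["message"]
```

Patching at the `Path` class level means the test does not need a real file with restricted permissions, so it behaves the same on every CI runner.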

All 17 new tests pass. Existing test suite delta: 1529 → 1546 passed, 2 pre-existing failures in test_cloud_quotas.py (unrelated to this PR).

14 failing RED tests covering Domain E dogfood findings:
- F#3: apps() reads CFBundleShortVersionString from Info.plist when simctl omits it
- F#8: tap verify_change=True returns screen_changed bool + ssim_delta float
- F#9: perf.snapshot windows CPU over 200ms and returns sample_window_ms field
- F#13: list_replays accepts min_steps param; default=1 excludes 0-step placeholders
- F#16: LintResult category field; 0-step recordings classified 'empty' not 'fail'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
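
The F#13 and F#16 expectations can be pictured with a small sketch. The field names other than `category`, the `classify` and `filter_replays` helpers, the `step_count` key, and the 'fail' status for recordings with steps but no requires block are all assumptions for illustration; the category values 'ok', 'empty', and 'missing_state_contract' come from the PR text.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class LintResult:
    path: str
    status: str
    category: str  # the F#16 field: 'ok', 'empty', or 'missing_state_contract'
    message: str = ""

    def to_dict(self) -> dict:
        return asdict(self)


def classify(path: str, step_count: int, has_requires: bool) -> LintResult:
    """Classify one recording; 0-step placeholders are 'empty' rather than failures."""
    if step_count == 0 and not has_requires:
        return LintResult(path, "empty", "empty", "0-step placeholder recording")
    if not has_requires:
        return LintResult(path, "fail", "missing_state_contract", "no requires block")
    return LintResult(path, "ok", "ok")


def filter_replays(replays: List[dict], min_steps: int = 1) -> List[dict]:
    """Keep replays with at least min_steps steps; min_steps=0 keeps placeholders."""
    return [r for r in replays if r.get("step_count", 0) >= min_steps]
```

Separating `status` from `category` lets a caller keep treating 'fail' as an error while still distinguishing why a recording was flagged.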

F#3: add _read_app_info_plist helper; list_apps() falls back to reading Info.plist from bundle when simctl omits CFBundleShortVersionString; final fallback to build number when both sources lack it.
F#8: add _compute_ssim() to server; tool_tap() accepts verify_change=true to capture pre/post screenshots and return screen_changed bool + ssim_delta float; default behaviour (no extra keys) unchanged.
F#9: perf.snapshot() now samples CPU over ~200 ms window (3 samples), averages them, and returns sample_window_ms=200 in the result dict.
F#13: list_replays() accepts min_steps=1 default; 0-step placeholders filtered out unless caller passes min_steps=0.
F#16: LintResult gains category field; 0-step recordings with no requires block get status='empty'/category='empty' instead of 'fail'; real recordings with steps-but-no-requires still get category='missing_state_contract'.

Updated stale test_lint_one_missing_requires to match new semantic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
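
The F#9 change (three samples averaged over roughly 200 ms) might look like the sketch below. `snapshot_cpu`, the injected `read_cpu_percent` callable, and the `cpu_percent` key are hypothetical names chosen for the example; only `sample_window_ms` and the 3-sample, 200 ms figures come from the commit message. How the real code reads CPU usage is not shown in this PR.

```python
import time
from typing import Callable


def snapshot_cpu(
    read_cpu_percent: Callable[[], float],
    window_ms: int = 200,
    samples: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Average `samples` CPU readings spread evenly across `window_ms`."""
    if samples < 1:
        raise ValueError("samples must be >= 1")
    # First and last readings bracket the window, so there are samples - 1 gaps.
    interval_s = (window_ms / 1000.0) / (samples - 1) if samples > 1 else 0.0
    readings = []
    for i in range(samples):
        if i > 0:
            sleep(interval_s)
        readings.append(read_cpu_percent())
    return {
        "cpu_percent": sum(readings) / len(readings),
        "sample_window_ms": window_ms,
    }
```

Injecting the sampler and the sleep function keeps the logic testable without waiting on a real clock or a running simulator.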

…duction lines

Add simdrive/tests/test_b5_domain_e_coverage.py (17 tests) to cover the new production paths introduced in the feat(b5) commit that dropped coverage from 90%+ to 89.21%:
- TestComputeSsim (9 tests): exercises _compute_ssim — None paths, missing files, non-PNG bytes, identical/different RGB/RGBA PNGs, mismatched dims, empty string paths. Covers server.py lines 815-874.
- TestToolTapVerifyChange (3 tests): verify_change=True/False paths in tool_tap with monkeypatched _compute_ssim. Covers server.py 1169-1173.
- TestLintOneOsError (2 tests): OSError branch in _lint_one via patched Path.read_text. Covers recorder.py lines 843-844.
- TestLintResultCategoryField (3 tests): to_dict() round-trip for category values 'ok', 'empty', 'missing_state_contract' (F#16 field). Covers the category serialisation path.

Production code unchanged. CI gate: 89.21% → 91.05%.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
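
To make the `_compute_ssim` cases above concrete, here is a simplified sketch of a single-window (global) SSIM over grayscale pixel values in the 0-255 range. It is not the code in server.py: PNG decoding is omitted, `global_ssim` and `verify_change` are hypothetical names, and defining `ssim_delta` as 1 minus the score, the 0.01 threshold, and treating an uncomputable score as "unchanged" are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Standard SSIM stabilising constants for 8-bit data (K1=0.01, K2=0.03, L=255).
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def global_ssim(
    a: Optional[Sequence[float]], b: Optional[Sequence[float]]
) -> Optional[float]:
    """Single-window SSIM of two equal-length pixel sequences, or None if not comparable."""
    if a is None or b is None or len(a) == 0 or len(a) != len(b):
        return None
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) * (x - mu_a) for x in a) / n
    var_b = sum((y - mu_b) * (y - mu_b) for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    numerator = (2 * mu_a * mu_b + C1) * (2 * cov + C2)
    denominator = (mu_a * mu_a + mu_b * mu_b + C1) * (var_a + var_b + C2)
    return numerator / denominator


def verify_change(before, after, threshold: float = 0.01) -> dict:
    """Build the extra keys for a tap result (assumed shape and threshold)."""
    score = global_ssim(before, after)
    if score is None:
        # Assumption: an uncomputable comparison is reported as "no change".
        return {"screen_changed": False, "ssim_delta": 0.0}
    delta = 1.0 - score
    return {"screen_changed": delta > threshold, "ssim_delta": delta}
```

Identical inputs give a score of 1.0 and therefore a delta of 0.0, while mismatched sizes or missing inputs return None from `global_ssim`, mirroring the None and mismatched-dimension paths the tests exercise.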

SyncTekLLC marked this pull request as ready for review May 22, 2026 19:22

SyncTekLLC force-pushed the test/b5-domain-e-apps-perf-lint branch from 1cc6f43 to 8bd89ab Compare May 22, 2026 20:22

SyncTekLLC and others added 3 commits May 22, 2026 16:25

SyncTekLLC force-pushed the test/b5-domain-e-apps-perf-lint branch from 8bd89ab to 63128a0 Compare May 22, 2026 20:25

SyncTekLLC merged commit ee280d3 into main May 22, 2026
5 of 9 checks passed

SyncTekLLC merged 3 commits into main from test/b5-domain-e-apps-perf-lint
