Python fixture parity for the 6-8 dead-baseline equivalence cases #215

@k-yoshimi

Background

#213 was resolved on 2026-05-26 by declaring equivalence tests Linux-canonical (see spec/policy at `docs/superpowers/specs/2026-05-26-linux-canonical-equiv-policy-design.md` and `docs/baseline-policy.md`). The original platform-keyed-baselines approach (per-platform `/-gcc/metrics.json`) was rejected because it depends on two structural prerequisites that have not yet been met; this issue tracks one of them.

The gap

6-8 of the 20 baselines under `test_run/baselines/` lack a corresponding Python fixture (`_params.py`) under `python//tests/fixtures/`: 6 have no fixture at all, and 2 more have a fixture only under a different name. They are generated only by the standalone-Fortran-binary regen workflow (`test_run/run_tests.sh` → `*regress.dat` → `extract_metrics.py` → `metrics.json`). On Linux clavius the workflow works; on macOS the standalone binaries don't link (no graphics libs).

Inventory (from develop @ 06b4055):

| Module | Baseline cases (committed) | Python fixtures (`*_params.py`) | Gap |
|---|---|---|---|
| eqlib | iter01, jt60, tst2 | iter01, tst2 | eq_jt60 |
| fplib | dt1, iter01, jt60 | dt1, iter01 | fp_jt60 |
| tilib | ar, min, w | ar, iter01 (orphan, baseline-less) | ti_min, ti_w |
| trlib | iter01, m0904, tst2 | iter01, tst2 | tr_m0904 |
| wrxlib | demo, iter01, jt60 | demo, iter01 | wrx_jt60 |
| totlib | demo2014_short, ht6m_short | demo2014, ht6m | `tot_*_short` name mismatch; needs resolution |
| wrlib | iter_lhcd, test001, tst2_ec | all 3 ✓ | (none; full coverage) |
| eqlib + … | | | ti_iter01 (orphan fixture, no baseline) |

Total gap: ~6 missing fixtures + 1 orphan fixture + 1 name-mismatch resolution.
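
The totals above can be re-derived from the inventory with a few set operations. The following is a minimal sketch, not an existing script in the repo: the `classify` helper and the dict layout are hypothetical, the case names are transcribed from the table, and the wrlib fixture names are assumed to match its baselines because the table only says "all 3".

```python
# Inventory transcribed from the table above (develop @ 06b4055).
BASELINES = {
    "eqlib": {"iter01", "jt60", "tst2"},
    "fplib": {"dt1", "iter01", "jt60"},
    "tilib": {"ar", "min", "w"},
    "trlib": {"iter01", "m0904", "tst2"},
    "wrxlib": {"demo", "iter01", "jt60"},
    "totlib": {"demo2014_short", "ht6m_short"},
    "wrlib": {"iter_lhcd", "test001", "tst2_ec"},
}
FIXTURES = {
    "eqlib": {"iter01", "tst2"},
    "fplib": {"dt1", "iter01"},
    "tilib": {"ar", "iter01"},
    "trlib": {"iter01", "tst2"},
    "wrxlib": {"demo", "iter01"},
    "totlib": {"demo2014", "ht6m"},
    # Assumption: the table says "all 3", so the names are taken to match.
    "wrlib": {"iter_lhcd", "test001", "tst2_ec"},
}


def classify(baselines, fixtures, suffix="_short"):
    """Split each module's cases into missing fixtures, orphans, and name mismatches."""
    report = {}
    for module in sorted(baselines):
        base = baselines[module]
        fix = fixtures.get(module, set())
        # A baseline "X_short" paired with a fixture "X" is a naming problem,
        # not a missing fixture.
        mismatched = {
            b for b in base - fix
            if b.endswith(suffix) and b[: -len(suffix)] in fix
        }
        paired_fixtures = {b[: -len(suffix)] for b in mismatched}
        report[module] = {
            "missing_fixture": sorted(base - fix - mismatched),
            "orphan_fixture": sorted(fix - base - paired_fixtures),
            "name_mismatch": sorted(mismatched),
        }
    return report


report = classify(BASELINES, FIXTURES)
for module, gaps in report.items():
    print(module, gaps)
```

Treating a `_short` suffix as a rename candidate is only a heuristic here; whether those totlib cases are truly the same is one of the open questions in this issue.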

Why these cases are dead

They appear in `test_run/baselines/`, but no module's `test_equivalence.py` exercises them through Python, so the Linux CI whole-tree run (`pytest python/`, at `python-tests.yml:323`) never touches them. They're committed but unverified at the Python wrapper level; only clavius regen runs verify them.

Closing the gap

For each missing case, add `python//tests/fixtures/_params.py` with an `apply(<lib_instance>)` function that replays the namelist parameter set (transcribed from `test_run/inputs/.in` plus any `.{eqparm,trparm,...}` companion files). Then add the case to the `CASES` dict in the module's `test_equivalence.py`.
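
As a rough illustration of the shape such a fixture might take, here is a minimal sketch. Everything in it beyond the `apply(...)` entry point and the `CASES` name is an assumption: the parameter names and values are placeholders (not taken from any input file), `set_param` is a hypothetical setter, `FakeLib` stands in for the real wrapper class, and the real `CASES` dict may map to something other than the `apply` function.

```python
# Hypothetical contents of a new fixture module (a file ending in `_params.py`).
# Parameter names and values are placeholders; real ones would be transcribed
# from the case's namelist input and companion files.
PARAMS = {
    "PARAM_A": 3.0,
    "PARAM_B": 1.0,
    "PARAM_C": 2,
}


def apply(lib):
    """Replay the namelist parameter set onto a library instance."""
    for name, value in PARAMS.items():
        lib.set_param(name, value)  # set_param is an assumed method name
    return lib


class FakeLib:
    """Stand-in for the real wrapper so this sketch runs on its own."""

    def __init__(self):
        self.params = {}

    def set_param(self, name, value):
        self.params[name] = value


# Assumed shape of the CASES dict in test_equivalence.py:
# case name -> function that configures a library instance.
CASES = {"jt60": apply}

lib = CASES["jt60"](FakeLib())
print(lib.params)
```

Keeping the parameter set in a plain dict makes it easy to diff against the namelist input during review, though the existing fixtures may follow a different convention that new ones should match.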

For `tot_*_short` name mismatch: decide whether the Python fixtures should be renamed to add `_short`, the baselines renamed to drop it, or whether they're truly different cases that need separate fixtures.

For `ti_iter01` orphan fixture: either generate a baseline for it (Linux clavius), or delete the fixture.

Why this matters

This is the prerequisite for revisiting the platform-keyed-baselines design if the project ever needs macOS (or other non-Linux) as a canonical equivalence-test platform. The other prerequisite is the graphics-stubs gap (`_static_stubs.f90` for fp/ti/tr/eq/wr/wrx so standalone binaries can build on macOS) — that's a Phase-L-sized concern and is OUT OF SCOPE for this issue.

Out of scope

  • The graphics-stubs gap (separate Phase-L follow-up).
  • Reviving the platform-keyed-baselines design (premature until BOTH gaps close — see `docs/superpowers/specs/2026-05-26-platform-keyed-baselines-design.md` SUPERSEDED banner).
  • Tolerance changes.
  • CI changes.

