Summary
test_run/baselines/tr_tst2/metrics.json is missing the AJRFT scalar key. After commit 24b1b12f ("test+tr: backfill AJRFT in regression dump") and the libtrapi.so Python wrapper update from PR #187, libtrapi.so emits 14 scalars (including AJRFT), but the tst2 baseline still has 13. Once eq_tst2 is fixed and tr_tst2 runs in CI, the Layer-1 equivalence test will fail with:
compare_metrics FAIL: scalars.AJRFT: missing
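The failure mode can be illustrated with a minimal sketch. This is not the project's actual compare_metrics implementation. It assumes metrics.json has a top-level "scalars" object (inferred from the scalars.AJRFT path in the message) and that a key present on only one side is reported as missing. The key Q0 and both numeric values are made up for illustration.

```python
import math


def compare_metrics(baseline, current, tol=1e-10):
    """Return a list of mismatch messages between two metrics dicts.

    Hypothetical stand-in for the real comparison: a scalar key found on
    only one side is reported as missing, and values are compared with
    the same number used as relative and absolute tolerance.
    """
    problems = []
    base_scalars = baseline.get("scalars", {})
    cur_scalars = current.get("scalars", {})
    for key in sorted(set(base_scalars) | set(cur_scalars)):
        if key not in base_scalars or key not in cur_scalars:
            problems.append(f"scalars.{key}: missing")
            continue
        if not math.isclose(base_scalars[key], cur_scalars[key],
                            rel_tol=tol, abs_tol=tol):
            problems.append(
                f"scalars.{key}: {base_scalars[key]!r} != {cur_scalars[key]!r}"
            )
    return problems


# Stale baseline without AJRFT versus a fresh dump that includes it.
baseline = {"scalars": {"Q0": 1.0}}
current = {"scalars": {"Q0": 1.0, "AJRFT": 0.25}}
print(compare_metrics(baseline, current))
```

Under these assumptions the stale baseline fails on the key set alone, before any numeric tolerance is applied.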
Why this wasn't fixed in the original AJRFT PR
tr_tst2 depends on eq_tst2, which has its own pre-existing ~3e-9 physics drift on the current develop branch (eq_tst2 regression on clavius 2026-05-10 — likely upstream eq changes since the baseline was last regenerated). Regenerating the tr_tst2 baseline requires either:
- Resolving the eq_tst2 drift first, OR
- Manually patching just the AJRFT: 0.0 line (risky: assumes the actual value is 0; in-house code review on commit 24b1b12f explicitly cautioned against this)
The choice was made to leave tst2 with the existing stale baseline (which is the status quo — the test SKIPs locally due to missing eqdata, hiding the issue) and address it as a separate concern.
Acceptance
- Resolve the eq_tst2 drift (separate work — investigate which develop change introduced the ~3e-9 shift; eq physics or eq baselines stale?)
- Regenerate the eq_tst2 baseline on a Linux box with current eq (see $CLAUDE_MEMORY_DIR/memory/reference_clavius_baseline_regen.md)
- Then regenerate the tr_tst2 baseline (will include AJRFT automatically via the fix in 24b1b12f)
- Verify pytest python/trlib/tests/test_equivalence.py --forked --timeout=120 --timeout-method=signal -v reports both test_iter01 AND test_tst2 PASSED at 1e-10, with no SKIP
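Because a SKIP currently hides the problem, the last check could be automated rather than read off the terminal. The sketch below assumes the pytest run is given the additional --junitxml option (a standard pytest flag, not part of the command above) and that the test names appear in the report exactly as test_iter01 and test_tst2. Parametrized IDs would need a looser match. The helper names and the sample report are hypothetical.

```python
import xml.etree.ElementTree as ET


def outcomes_from_junit(xml_text):
    """Map each testcase name in a JUnit XML report to passed/failed/skipped."""
    root = ET.fromstring(xml_text)
    results = {}
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            status = "skipped"
        elif case.find("failure") is not None or case.find("error") is not None:
            status = "failed"
        else:
            status = "passed"
        results[case.get("name")] = status
    return results


def check_acceptance(xml_text, required=("test_iter01", "test_tst2")):
    """Return one message per required test that did not actually pass."""
    results = outcomes_from_junit(xml_text)
    return [
        f"{name}: {results.get(name, 'not run')}"
        for name in required
        if results.get(name) != "passed"
    ]


# Illustrative report: test_tst2 is skipped, as it is locally today.
SAMPLE = """<testsuites><testsuite name="pytest" tests="2" skipped="1">
<testcase classname="python.trlib.tests.test_equivalence" name="test_iter01"/>
<testcase classname="python.trlib.tests.test_equivalence" name="test_tst2">
<skipped message="missing eqdata"/>
</testcase>
</testsuite></testsuites>"""

print(check_acceptance(SAMPLE))
```

Treating "skipped" and "not run" as failures is the point of the design: a skipped run leaves the list non-empty instead of counting as a pass.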
Related
- 24b1b12f (chore/pre-push-hook-worktree-compat)
- feedback_scalar_field_triangle.md, feedback_equivalence_must_pass.md
🤖 Filed via post-push retrospective on 2026-05-11.