
testsuite: WAST runner assertion counts are non-deterministic (~18/run) — can't gate conformance on the number #360

Description

@avrabe

Two full runs of the IDENTICAL tree report different WAST assertion totals: 65776 pass vs 65794 pass (a difference of 18), and the total number of executed assertions drifts as well (66325 vs 66307). Before/after comparison of assertion counts is therefore unreliable: a real change's delta (+2 for #359) is smaller than the roughly 18-assertion run-to-run noise, which can mask or fake a regression.
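To make the signal-versus-noise argument concrete, here is a small sketch that uses only the figures quoted above. The helper name `is_detectable` is made up for illustration and is not part of the runner.

```python
# Figures quoted in this issue; the helper below is hypothetical.
pass_counts = (65776, 65794)      # passing assertions reported by the two runs
executed_counts = (66325, 66307)  # total executed assertions reported by the two runs
change_delta = 2                  # expected effect of the #359 fix

pass_noise = abs(pass_counts[0] - pass_counts[1])
executed_noise = abs(executed_counts[0] - executed_counts[1])


def is_detectable(delta: int, noise: int) -> bool:
    """A count delta only means something if it exceeds run-to-run noise."""
    return abs(delta) > noise


print(pass_noise, executed_noise, is_detectable(change_delta, pass_noise))
```

With a delta of 2 against a spread of 18, the count comparison cannot distinguish the fix from noise.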

The FILE-level result is stable: both the Files: N pass/M fail summary and the failing-file SET reproduce across runs. The #149 no-regression check therefore had to be done by diffing the failing-file sets (with comm), not by comparing counts:

files failing AFTER but not BASE: (none)              # no regression
files failing BASE but not AFTER: type-subtyping.wast # the fix
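The same set comparison can be sketched in Python. This is a minimal illustration, assuming each run's failing files are available as a list of names, one per line; `diff_failing_sets` and the placeholder file `hypothetical-a.wast` are invented for the example and do not come from the runner.

```python
def diff_failing_sets(base_failures, after_failures):
    """Compare two collections of failing file names (order is irrelevant).

    Returns the files that fail only AFTER (regressions) and the files
    that fail only in BASE (fixes), each sorted for stable output.
    """
    base = {line.strip() for line in base_failures if line.strip()}
    after = {line.strip() for line in after_failures if line.strip()}
    return {
        "regressions": sorted(after - base),
        "fixes": sorted(base - after),
    }


# "hypothetical-a.wast" is a placeholder for a file that fails in both runs.
base = ["type-subtyping.wast", "hypothetical-a.wast"]
after = ["hypothetical-a.wast"]
result = diff_failing_sets(base, after)
print("files failing AFTER but not BASE:", result["regressions"] or "(none)")
print("files failing BASE but not AFTER:", result["fixes"])
```

Because it compares sets rather than totals, the result does not depend on how many assertions happened to execute in either run.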

Impact: blocks gating conformance in CI on the assertion count (not reproducible); slowed #149 verification (multiple ~20-min full runs + set-diffing).

Suspected cause: order/state-dependence (a module failing instantiation changes how many downstream assert_returns execute) and/or runner parallelism.
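A toy model of the first suspected mechanism, to show why the executed count can move even when nothing in the tree changes. This is not the runner's actual logic; the command tuples and `count_executed` are assumptions made up for illustration.

```python
def count_executed(commands, instantiation_ok):
    """Count assert_return commands that actually run in a toy script.

    commands: list of ("module", name) or ("assert_return", module_name).
    instantiation_ok: callable deciding whether a module instantiates.
    An assertion is only executed if its target module is live.
    """
    live = set()
    executed = 0
    for kind, name in commands:
        if kind == "module":
            if instantiation_ok(name):
                live.add(name)
        elif kind == "assert_return" and name in live:
            executed += 1
    return executed


script = [("module", "m")] + [("assert_return", "m")] * 3

# Same script, different instantiation outcome, different executed total.
executed_when_ok = count_executed(script, lambda name: True)
executed_when_failed = count_executed(script, lambda name: False)
print(executed_when_ok, executed_when_failed)
```

If the instantiation outcome depends on leftover state or execution order, the executed total inherits that variability, which would be consistent with the drift reported above.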

Asks: (1) make the executed-assertion count deterministic (fixed order, isolated per-module state), and/or (2) emit a stable machine-readable per-file report (--wast-report JSON) so CI can gate on the file-level set/counts, which are already stable. Surfaced while landing #359 (GC call_indirect subtyping fix, #149).
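For ask (2), a CI gate on such a report could look roughly like the sketch below. The JSON schema (`files`, `name`, `status`) is purely an assumption, since the report format does not exist yet, and `failing_files` and `gate` are hypothetical helper names.

```python
import json


def failing_files(report_text):
    """Extract the set of failing file names from a report (assumed schema)."""
    report = json.loads(report_text)
    return {f["name"] for f in report["files"] if f["status"] == "fail"}


def gate(baseline_text, current_text):
    """Pass only if no file fails now that did not fail in the baseline."""
    new_failures = sorted(failing_files(current_text) - failing_files(baseline_text))
    return (not new_failures, new_failures)


# Hypothetical reports: the baseline fails type-subtyping.wast, the fix passes it.
baseline = json.dumps({"files": [{"name": "type-subtyping.wast", "status": "fail"}]})
current = json.dumps({"files": [{"name": "type-subtyping.wast", "status": "pass"}]})

ok, new_failures = gate(baseline, current)
print("gate passed" if ok else f"new failures: {new_failures}")
```

Gating on the failing-file set ignores the assertion totals entirely, so it stays usable even if ask (1) is never done.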

Labels: tool-friction (Friction hit while using the tool for real work; dogfooding signal)
