diff --git a/.github/workflows/testsuite-repin.yml b/.github/workflows/testsuite-repin.yml
new file mode 100644
index 00000000..a7cd8031
--- /dev/null
+++ b/.github/workflows/testsuite-repin.yml
@@ -0,0 +1,80 @@
+name: Testsuite Re-pin & Conformance Baseline
+
+# Regularly re-pins external/testsuite (WebAssembly/testsuite) to latest upstream
+# and re-measures WAST conformance, opening a PR for review.
+#
+# Why: the submodule must stay PINNED (committed gitlink == checked-out commit)
+# so every checkout / CI run / git worktree exercises the SAME suite — otherwise
+# conformance numbers silently diverge between environments (this confound was
+# kiln#360: main had 7e0b83a checked out while the gitlink recorded a different
+# commit, so a fresh worktree tested a different suite). A fixed old pin goes
+# stale against the evolving spec, so this job moves the pin forward on a cadence
+# and surfaces the new conformance baseline for human review (spec evolution can
+# add failing files, so the new baseline must be reviewed, not auto-merged).
+
+on:
+  schedule:
+    - cron: '0 4 1 * *'  # monthly, 1st @ 04:00 UTC
+  workflow_dispatch:
+
+permissions:
+  contents: write
+  pull-requests: write
+
+jobs:
+  repin:
+    name: Re-pin testsuite + re-measure conformance
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+        with:
+          submodules: recursive
+
+      - name: Install Rust
+        uses: dtolnay/rust-toolchain@stable
+
+      - name: Bump testsuite to latest upstream
+        id: bump
+        run: |
+          cd external/testsuite
+          git fetch origin
+          git checkout origin/main
+          echo "commit=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
+          echo "short=$(git rev-parse --short HEAD)" >> "$GITHUB_OUTPUT"
+
+      - name: Run WAST conformance (legacy excluded by the runner)
+        id: conf
+        continue-on-error: true
+        run: |
+          cargo run --release -p cargo-kiln -- testsuite --run-wast \
+            --wast-dir external/testsuite | tee conformance.txt
+          {
+            echo "summary<> "$GITHUB_OUTPUT"
+
+      - name: Open re-pin PR
+        uses: peter-evans/create-pull-request@v7
+        with:
+          add-paths: external/testsuite
+          branch: chore/testsuite-repin
+          delete-branch: true
+          title: "chore(testsuite): re-pin to ${{ steps.bump.outputs.short }} + conformance re-baseline"
+          commit-message: |
+            chore(testsuite): re-pin external/testsuite to ${{ steps.bump.outputs.short }}
+
+            Automated re-pin to latest WebAssembly/testsuite + conformance
+            re-measure. Review the new baseline before merging. See kiln#360.
+          body: |
+            Automated re-pin of `external/testsuite` to latest upstream
+            (`${{ steps.bump.outputs.commit }}`), keeping the submodule pinned so
+            CI / worktrees run a reproducible suite (kiln#360).
+
+            **WAST conformance against the new suite:**
+            ```
+            ${{ steps.conf.outputs.summary }}
+            ```
+
+            Review the delta vs the previous baseline before merging, and update
+            the conformance note in `safety/requirements/architecture-components.yaml`.
diff --git a/external/testsuite b/external/testsuite
index c337f0da..193e551f 160000
--- a/external/testsuite
+++ b/external/testsuite
@@ -1 +1 @@
-Subproject commit c337f0da6477acd40fbcab98671a68f59106ad86
+Subproject commit 193e551ff22663995b1ac95dc62344133669e14b
diff --git a/safety/requirements/architecture-components.yaml b/safety/requirements/architecture-components.yaml
index a40e7edc..87b38a6a 100644
--- a/safety/requirements/architecture-components.yaml
+++ b/safety/requirements/architecture-components.yaml
@@ -42,7 +42,7 @@ artifacts:
     fields:
       previous-id: ARCH_COMP_001
       implementation: kiln-runtime/src/stackless/engine.rs
-      note: "Infrastructure implemented; 332 WAST assertion failures remain across 74/280 files (verified 2026-06-13, testsuite @7e0b83a), down from ~1,500. Dominated by GC subtyping (type-subtyping, br_on_cast×4, i31, struct/array, ref_*), linking/imports (instance, imports*), tables/elem. Issue 149; unreached-invalid excess-value cases are Issue 146."
+      note: "Infrastructure implemented. Conformance baseline (measured from source; testsuite re-pinned to WebAssembly/testsuite @193e551, 2026-06-17): 263/281 files pass; 65980 assertions pass / 529 fail. Remaining failures concentrated in 17 files, dominated by the GC custom-descriptors proposal (proposals/custom-descriptors/*) + br_on_cast/br_on_null/return_call_ref (Issue 149). type-subtyping.wast now passes (PR #359). The submodule is PINNED (gitlink == checkout) and moved forward on a cadence by .github/workflows/testsuite-repin.yml so the number is reproducible across CI/worktrees (kiln#360). The earlier '332 across 74 files @7e0b83a' figure was stale and measured against a divergent submodule checkout."
 
   - id: AC-MEMORY
     type: feature