Merged
80 changes: 80 additions & 0 deletions .github/workflows/testsuite-repin.yml
@@ -0,0 +1,80 @@
name: Testsuite Re-pin & Conformance Baseline

# Regularly re-pins external/testsuite (WebAssembly/testsuite) to latest upstream
# and re-measures WAST conformance, opening a PR for review.
#
# Why: the submodule must stay PINNED (committed gitlink == checked-out commit)
# so every checkout / CI run / git worktree exercises the SAME suite — otherwise
# conformance numbers silently diverge between environments (this confound was
# kiln#360: main had 7e0b83a checked out while the gitlink recorded a different
# commit, so a fresh worktree tested a different suite). A fixed old pin goes
# stale against the evolving spec, so this job moves the pin forward on a cadence
# and surfaces the new conformance baseline for human review (spec evolution can
# add failing files, so the new baseline must be reviewed, not auto-merged).

on:
  schedule:
    - cron: '0 4 1 * *'  # monthly, 1st @ 04:00 UTC
  workflow_dispatch:

permissions:
  contents: write
  pull-requests: write

jobs:
  repin:
    name: Re-pin testsuite + re-measure conformance
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
        with:
          submodules: recursive

      - name: Install Rust
        uses: dtolnay/rust-toolchain@stable

      - name: Bump testsuite to latest upstream
        id: bump
        run: |
          cd external/testsuite
          git fetch origin
          git checkout origin/main
          echo "commit=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
          echo "short=$(git rev-parse --short HEAD)" >> "$GITHUB_OUTPUT"

      - name: Run WAST conformance (legacy excluded by the runner)
        id: conf
        continue-on-error: true
        run: |
          cargo run --release -p cargo-kiln -- testsuite --run-wast \
            --wast-dir external/testsuite | tee conformance.txt
          {
            echo "summary<<EOF"
            grep -E 'Files:|Assertions:' conformance.txt
            echo "EOF"
          } >> "$GITHUB_OUTPUT"

      - name: Open re-pin PR
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: external/testsuite
          branch: chore/testsuite-repin
          delete-branch: true
          title: "chore(testsuite): re-pin to ${{ steps.bump.outputs.short }} + conformance re-baseline"
          commit-message: |
            chore(testsuite): re-pin external/testsuite to ${{ steps.bump.outputs.short }}

            Automated re-pin to latest WebAssembly/testsuite + conformance
            re-measure. Review the new baseline before merging. See kiln#360.
          body: |
            Automated re-pin of `external/testsuite` to latest upstream
            (`${{ steps.bump.outputs.commit }}`), keeping the submodule pinned so
            CI / worktrees run a reproducible suite (kiln#360).

            **WAST conformance against the new suite:**
            ```
            ${{ steps.conf.outputs.summary }}
            ```

            Review the delta vs the previous baseline before merging, and update
            the conformance note in `safety/requirements/architecture-components.yaml`.
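
The invariant this workflow protects (committed gitlink == checked-out commit) can also be checked locally before trusting a conformance number. The following is a minimal sketch, not part of this PR: `drifted_submodules` is a hypothetical helper name. It relies only on the status prefix that `git submodule status` prints at the start of each line: a space when the checkout matches the recorded gitlink, `+` when the checked-out commit differs from it, `-` when the submodule is not initialized, and `U` when it has merge conflicts. Paths containing spaces are not handled.

```shell
# Hypothetical helper (not in the PR): reads `git submodule status` output on
# stdin and prints "<flag> <path>" for every submodule that is not cleanly
# pinned. Prints nothing when every checkout matches its recorded gitlink.
drifted_submodules() {
  awk '
    NF == 0 { next }                 # ignore blank lines
    {
      flag = substr($0, 1, 1)        # " ", "+", "-" or "U"
      if (flag != " ") {
        rest = substr($0, 2)         # "<sha> <path> [(<describe>)]"
        split(rest, f, " ")
        print flag, f[2]
      }
    }
  '
}
```

A possible use is `git submodule status --recursive | drifted_submodules`: empty output means the suite under test is the pinned one, while a line such as `+ external/testsuite` indicates the kind of divergence described in kiln#360.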
2 changes: 1 addition & 1 deletion safety/requirements/architecture-components.yaml
@@ -42,7 +42,7 @@ artifacts:
     fields:
       previous-id: ARCH_COMP_001
       implementation: kiln-runtime/src/stackless/engine.rs
-      note: "Infrastructure implemented; 332 WAST assertion failures remain across 74/280 files (verified 2026-06-13, testsuite @7e0b83a), down from ~1,500. Dominated by GC subtyping (type-subtyping, br_on_cast×4, i31, struct/array, ref_*), linking/imports (instance, imports*), tables/elem. Issue 149; unreached-invalid excess-value cases are Issue 146."
+      note: "Infrastructure implemented. Conformance baseline (measured from source; testsuite re-pinned to WebAssembly/testsuite @193e551, 2026-06-17): 263/281 files pass; 65980 assertions pass / 529 fail. Remaining failures concentrated in 17 files, dominated by the GC custom-descriptors proposal (proposals/custom-descriptors/*) + br_on_cast/br_on_null/return_call_ref (Issue 149). type-subtyping.wast now passes (PR #359). The submodule is PINNED (gitlink == checkout) and moved forward on a cadence by .github/workflows/testsuite-repin.yml so the number is reproducible across CI/worktrees (kiln#360). The earlier '332 across 74 files @7e0b83a' figure was stale and measured against a divergent submodule checkout."
 
   - id: AC-MEMORY
     type: feature