Summary
Test262's canonical harness runs each unflagged test (no flags: [onlyStrict] / [noStrict]) as two variants — once in sloppy mode and once with a "use strict" prologue. SharpTS currently runs each unflagged test once, sloppy only. This issue tracks adopting variant-pair execution so we match the canonical harness coverage.
Split out from #882 (now closed), where the investigation and the codegen prerequisites were completed. This is the remaining deliberate, maintainer-gated piece — a coverage/infrastructure enhancement, not a bug.
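The variant rule described above can be sketched as follows. This is a minimal illustration, not the SharpTS runner: variantsFor and assembleVariant are hypothetical names, and the only behavior taken from this issue is that unflagged tests get both variants while onlyStrict / noStrict tests get one. Treating raw tests as sloppy-only follows Test262's interpreting rules; module tests are not modeled here.

```typescript
type Variant = "sloppy" | "strict";

// Decide which variants a test runs, based on its frontmatter flags.
// Hypothetical helper; the flag names are Test262's own.
function variantsFor(flags: readonly string[]): Variant[] {
  if (flags.includes("onlyStrict")) return ["strict"];
  // raw tests must not be modified, so they can never get a strict prologue.
  if (flags.includes("noStrict") || flags.includes("raw")) return ["sloppy"];
  return ["sloppy", "strict"]; // unflagged: the variant pair
}

// Build the source for one variant. The strict variant is the same program
// with a "use strict" directive prologue in front of it.
function assembleVariant(source: string, variant: Variant): string {
  return variant === "strict" ? `"use strict";\n${source}` : source;
}

// Usage: an unflagged test expands to two programs, a flagged one to one.
const programs = variantsFor([]).map((v) => assembleVariant("var x = 1;", v));
```

Keeping the variant decision separate from source assembly means the flagged-test path stays unchanged: it simply yields a one-element list.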
Why it was deferred from #882
Running unflagged tests as both variants is a baseline-semantics + coverage change, not a drop-in. Measured impact during #882:
- Baseline model changes: every unflagged test gains a second path → bucket entry (a strict variant). The current model is one result per path; variant-pairs make it two.
- Run time roughly doubles: every unflagged test executes twice.
- It exposes a real set of strict-mode failures: in a 495-test unflagged sample run force-strict, about 10% (49/495) initially passed sloppy but failed strict, with 0 strict-only gains.
That made it a planned rollout decision rather than an autonomous change, consistent with #882's "regenerate the baseline only after triage" stance.
What's already done (prerequisite work, shipped in #884)
The concrete strict-mode codegen bugs that variant-pair coverage would expose were fixed proactively in #882/#884, so the runner change won't land on a pile of red. The 495-test force-strict sample went 49 → 40 → 34 → 27 → 8 → 4 → 0 sloppy-Pass→strict-Fail across the sweep:
- strict dynamic property writes on built-in objects persist
- strict writes on Date/RegExp/Promise/Error (PDS) objects persist
- strict delete removes Object.defineProperty (PDS) properties + honors configurability
- strict writes honor non-writable / accessor descriptors (TypeError / setter invocation)
- strict indexed writes on $Object receivers persist + honor preventExtensions
- strict symbol-keyed and globalThis-sentinel writes persist
- onlyStrict tests now actually run strict (Assemble() prepends a program-level "use strict")
So the codegen floor is in place; the sampled systematic strict clusters are resolved. A full (non-sampled) regen is still needed to confirm suite-wide before/at rollout.
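To make the sloppy/strict difference behind these fixes concrete, here is a small sketch of the kind of behavior a variant pair observes. It uses plain ECMAScript semantics, not SharpTS internals: the same write to a frozen object is silently ignored in sloppy code and throws a TypeError in strict code. The helper name outcome is illustrative only.

```typescript
// A function created with the Function constructor is strict only if its own
// body starts with a "use strict" directive, so this gives one body in both modes.
const body = "obj.x = 2; return obj.x;";
const sloppyWrite = new Function("obj", body);
const strictWrite = new Function("obj", `"use strict";\n${body}`);

// Run a write against a fresh frozen object and describe what happened.
function outcome(write: Function): string {
  const frozen = Object.freeze({ x: 1 });
  try {
    return `returned ${write(frozen)}`;
  } catch (e) {
    return e instanceof TypeError ? "threw TypeError" : "threw something else";
  }
}

const sloppyOutcome = outcome(sloppyWrite); // "returned 1": the write is ignored
const strictOutcome = outcome(strictWrite); // "threw TypeError"
```

A test asserting the TypeError can only pass in the strict variant, and one asserting the silent no-op only in the sloppy variant, which is why a sloppy-only run leaves this class of behavior unchecked for unflagged tests.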
Scope of this issue
- Harness runner: for each unflagged test, execute both a sloppy variant and a "use strict"-prefixed strict variant (the strict prologue mechanism already exists from the onlyStrict fix). onlyStrict / noStrict-flagged tests keep running their single designated variant.
- Baseline model: extend path → bucket to distinguish the two variants per unflagged path (e.g. a variant suffix/qualifier on the key). Applies to both baselines/compiled.txt and baselines/interpreted.txt, i.e. to both the interpreter and compiled runners.
- Full regeneration: regenerate both baselines under the new model (SHARPTS_TEST262_UPDATE_BASELINE=1) and record the resulting strict-variant Fail entries; triage any clusters a full run surfaces beyond the 495-test sample.
- Run-time budget: confirm the ~2× wall-clock cost is acceptable for the subset config / CI, or gate strict variants behind a config flag if not.
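One possible shape for the baseline key change is sketched below. The "#strict" suffix is purely an assumption for illustration (the key encoding is still an open decision), as are the names baselineKey and strictOnlyFailures and the in-memory Map; the real baseline file format is not modeled.

```typescript
type BaselineVariant = "sloppy" | "strict";

// Hypothetical suffix; any qualifier that cannot appear in a test path would do.
const STRICT_SUFFIX = "#strict";

// Sloppy keys stay exactly as they are today, so existing entries remain valid;
// only the new strict variant gets a qualified key.
function baselineKey(path: string, variant: BaselineVariant): string {
  return variant === "strict" ? path + STRICT_SUFFIX : path;
}

// List paths whose sloppy variant is Pass but whose strict variant is Fail,
// i.e. the sloppy-Pass→strict-Fail set tracked during the sweep.
function strictOnlyFailures(baseline: Map<string, string>): string[] {
  const result: string[] = [];
  baseline.forEach((bucket, key) => {
    if (!key.endsWith(STRICT_SUFFIX) || bucket !== "Fail") return;
    const path = key.slice(0, key.length - STRICT_SUFFIX.length);
    if (baseline.get(path) === "Pass") result.push(path);
  });
  return result.sort();
}

// Usage with made-up paths: only a.js passes sloppy and fails strict.
const sample = new Map<string, string>([
  [baselineKey("a.js", "sloppy"), "Pass"],
  [baselineKey("a.js", "strict"), "Fail"],
  [baselineKey("b.js", "sloppy"), "Pass"],
  [baselineKey("b.js", "strict"), "Pass"],
]);
const regressions = strictOnlyFailures(sample);
```

Leaving the sloppy key unqualified keeps the diff from a regeneration limited to added strict entries, which may make the first triage pass easier to review.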
Open decisions for the maintainer
- Go / no-go on adopting variant-pairs at all (vs. keeping sloppy-only and relying on onlyStrict-flagged tests for strict coverage).
- Baseline key encoding for the second variant.
- Default-on vs. opt-in flag, given the run-time cost.
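If the opt-in route is chosen, the gate and its cost could look roughly like this. The variable name SHARPTS_TEST262_STRICT_VARIANTS is hypothetical (it only mirrors the existing SHARPTS_TEST262_UPDATE_BASELINE=1 convention), and both helper functions are illustrative names rather than existing code.

```typescript
// Hypothetical environment variable, modeled on SHARPTS_TEST262_UPDATE_BASELINE.
const STRICT_VARIANTS_VAR = "SHARPTS_TEST262_STRICT_VARIANTS";

// Resolve the gate: an explicit "1" or any other value overrides the default,
// so the same code supports both default-on and opt-in rollouts.
function strictVariantsEnabled(
  env: Record<string, string | undefined>,
  defaultOn: boolean,
): boolean {
  const raw = env[STRICT_VARIANTS_VAR];
  return raw === undefined ? defaultOn : raw === "1";
}

// Executions per run: flagged tests always run once; unflagged tests run
// twice only when strict variants are enabled.
function plannedExecutions(unflagged: number, flagged: number, strictEnabled: boolean): number {
  return flagged + unflagged * (strictEnabled ? 2 : 1);
}

// Usage with made-up counts: 900 unflagged + 100 flagged tests.
const withStrict = plannedExecutions(900, 100, true); // 1900
const withoutStrict = plannedExecutions(900, 100, false); // 1000
```

Because flagged tests still run once, the total lands somewhat under 2× whenever the subset contains onlyStrict / noStrict tests.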
References
- SharpTS.Test262/ (HarnessAssembler / Assemble(), baseline diff harness, baselines/{compiled,interpreted}.txt).