Mark RecursiveFactorization Enzyme tests broken (Enzyme >= 0.13.155 regression)#1032

Merged
ChrisRackauckas merged 3 commits into
SciML:mainfrom
ChrisRackauckas-Claude:cap-enzyme-0.13.154-nopre
Jun 10, 2026

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented Jun 10, 2026

Contributor

Please ignore until reviewed by @ChrisRackauckas.

Problem

The tests / NoPre (julia lts, self-hosted) job has been failing on main and on all PR branches since ~2026-06-10 05:00 UTC. The culprit is Enzyme v0.13.155 (released 04:41 UTC that day; it is the only package delta between the last passing and first failing runs). It crashes at compile time with

MethodError: no method matching LLVM.ConstantInt(::LLVM.VectorType, ::Int64)
  in abs_typeof at src/absint.jl:857

whenever the primal call graph contains VectorizationBase explicit-SIMD gathers (single-index vector GEPs), even behind custom EnzymeRules. For LinearSolve that means RecursiveFactorization's TriangularSolve kernels. The upstream issue, which includes an Enzyme-only MWE (verified to crash on 0.13.155 and to run cleanly on 0.13.154), is EnzymeAD/Enzyme.jl#3164.

Fix (per review: mark broken, don't cap)

The original cap of Enzyme to 0.13.154 is reverted. Instead, the only three affected call sites in test/nopre/enzyme.jl are marked @test_broken, each with a comment linking the upstream issue:

  1. The DefaultLinearSolver reverse-mode call (with RecursiveFactorization loaded, the default solver compiles the RFLU branch);
  2. The explicit RFLUFactorization reverse-mode cache-reuse test (f2);
  3. The Forward-mode RFLU jacobian in the alg-loop testset.

Every other site was probed individually on 0.13.155 (plain LUFactorization even with RF loaded, DefaultLinearSolver without RF, cache init/solve! paths, KrylovJL_GMRES, BunchKaufman, and all the structured-wrapper testsets) and passes — those keep running on latest Enzyme. When a fixed Enzyme release lands, the @test_brokens flip to unexpected-pass errors, flagging them for removal.
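
As a minimal sketch of the marking pattern described above, using only Julia's Test standard library: `failing_derivative` and `working_derivative` are hypothetical stand-ins, not functions from test/nopre/enzyme.jl. `@test_broken` records a Broken result while its expression throws or evaluates to false, and records an error once the expression starts evaluating to true.

```julia
using Test

# Hypothetical stand-in for a call site that currently fails to compile.
failing_derivative(x) = error("upstream compile-time regression")

# Hypothetical stand-in for a call site that is unaffected.
working_derivative(x) = 2x

results = @testset "broken-marker sketch" begin
    # Unaffected site: keeps running as an ordinary test.
    @test working_derivative(3.0) == 6.0

    # Affected site: the thrown error is recorded as Broken, not as a failure.
    # See upstream issue EnzymeAD/Enzyme.jl#3164.
    @test_broken failing_derivative(3.0) == 6.0
end
```

Running this prints a summary with one Pass and one Broken and does not throw, which mirrors how the marked sites keep the job green while the regression stays visible in the test summary.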

Verification

Julia 1.10.11, Enzyme 0.13.155, full test/nopre/enzyme.jl:

Test Summary:           | Pass  Broken  Total     Time
Enzyme Derivative Rules |   40       3     43  5m32.5s

(Exit code 0. The earlier cap-based approach also passed verification, as described in the earlier comments, but capping hides the regression instead of tracking it.)

🤖 Generated with Claude Code

Enzyme v0.13.155 (released 2026-06-10, via EnzymeAD/Enzyme.jl#3154) throws
MethodError: no method matching LLVM.ConstantInt(::LLVM.VectorType, ::Int64)
in Enzyme.Compiler.abs_typeof (src/absint.jl:857) when compiling the
LinearSolve Enzyme derivative rules tests on Julia 1.10, breaking the
NoPre (julia lts) CI job. Cap until fixed upstream.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Full regression analysis, in upstream-issue form for whenever this gets filed on EnzymeAD/Enzyme.jl:


Title: v0.13.155 regression: MethodError: LLVM.ConstantInt(::LLVM.VectorType, ::Int64) in abs_typeof GEP stride analysis (Julia 1.10)

Enzyme v0.13.155 (Enzyme_jll 0.0.266+0, released 2026-06-10T04:41:53Z) introduced a regression on Julia 1.10.11 that breaks reverse-mode compilation of LinearSolve.jl's derivative rules. v0.13.154 works. Diffing the Pkg resolution logs of the last passing and first failing CI runs shows Enzyme 0.13.154→0.13.155 / Enzyme_jll 0.0.265→0.0.266 is the only package delta — LLVM.jl (9.8.2) and GPUCompiler (1.20.0) unchanged, ruling LLVM.jl out as the trigger.
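
The comparison above was done on Pkg resolution logs. A similar check can be sketched on two Manifest.toml files with the TOML standard library; this is an illustration under assumptions, not the procedure used in the analysis. `package_versions` and `version_delta` are hypothetical helpers, and the inline manifests are trimmed to the four packages named above.

```julia
using TOML

# Hypothetical helper: map package name => version from manifest text
# (assumes manifest_format 2.0, where packages live under `deps`).
function package_versions(manifest_text::AbstractString)
    manifest = TOML.parse(manifest_text)
    versions = Dict{String,String}()
    for (name, entries) in get(manifest, "deps", Dict())
        for entry in entries
            haskey(entry, "version") && (versions[name] = entry["version"])
        end
    end
    return versions
end

# Hypothetical helper: packages whose version differs between two runs.
function version_delta(old::Dict, new::Dict)
    delta = Dict{String,Tuple{String,String}}()
    for (name, v) in new
        oldv = get(old, name, "absent")
        oldv == v || (delta[name] = (oldv, v))
    end
    return delta
end

# Trimmed, illustrative manifests; real ones also carry UUIDs and hashes.
passing = """
manifest_format = "2.0"
[[deps.Enzyme]]
version = "0.13.154"
[[deps.Enzyme_jll]]
version = "0.0.265"
[[deps.LLVM]]
version = "9.8.2"
[[deps.GPUCompiler]]
version = "1.20.0"
"""

failing = replace(passing, "0.13.154" => "0.13.155", "0.0.265" => "0.0.266")

delta = version_delta(package_versions(passing), package_versions(failing))
```

With these inputs, `delta` contains entries only for Enzyme and Enzyme_jll; LLVM and GPUCompiler are absent because their versions match.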

LoadError: MethodError: no method matching LLVM.ConstantInt(::LLVM.VectorType, ::Int64)
Closest candidates are:
  LLVM.ConstantInt(::LLVM.IntegerType, ::Union{Int64, UInt64})
   @ LLVM ~/.julia/packages/LLVM/eBGq5/src/core/value/constant.jl:117
Stacktrace:
  [1] abs_typeof(arg::LLVM.Value, partial::Bool, seenphis::Set{LLVM.PHIInst})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/eARJy/src/absint.jl:857
  [2] abs_typeof @ ~/.julia/packages/Enzyme/eARJy/src/absint.jl:366 [inlined]
  [3] compile_unhooked @ ~/.julia/packages/Enzyme/eARJy/src/compiler.jl:5834

Root cause: the new GEP stride-analysis block added in EnzymeAD/Enzyme.jl#3154 "Improve type lowering for ptr memcpy" (commit 986746fe, src/absint.jl ~lines 835-905) calls LLVM.ConstantInt(value_type(idx), 0) / LLVM.ConstantInt(value_type(idx), 1) on a non-constant GEP index. For a vector GEP, value_type(idx) is an LLVM.VectorType, not an LLVM.IntegerType, so there is no matching LLVM.ConstantInt method. The block should bail out (or splat a constant vector) when value_type(idx) is not an IntegerType.
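
The failure mode and the suggested guard can be illustrated without LLVM.jl. In the sketch below, `IntegerTy`, `VectorTy`, `constant_int`, `unguarded`, and `guarded` are hypothetical stand-ins for `LLVM.IntegerType`, `LLVM.VectorType`, `LLVM.ConstantInt`, and the stride-analysis block; none of this is Enzyme's actual code or the upstream fix.

```julia
# Hypothetical stand-ins for the LLVM type hierarchy.
abstract type Ty end

struct IntegerTy <: Ty
    bits::Int
end

struct VectorTy <: Ty
    elem::IntegerTy
    len::Int
end

# Like the real constructor, this is only defined for integer types.
constant_int(ty::IntegerTy, value::Integer) = (ty, value)

# Unguarded: assumes the index type is always an integer type.
unguarded(idx_ty::Ty) = (constant_int(idx_ty, 0), constant_int(idx_ty, 1))

# Guarded: bails out (returns nothing) for non-integer index types.
function guarded(idx_ty::Ty)
    idx_ty isa IntegerTy || return nothing
    return (constant_int(idx_ty, 0), constant_int(idx_ty, 1))
end

scalar_idx = IntegerTy(64)
vector_idx = VectorTy(IntegerTy(64), 4)

# The unguarded path hits a MethodError on a vector index type.
threw_method_error = try
    unguarded(vector_idx)
    false
catch err
    err isa MethodError
end

# The guarded path skips the analysis instead of crashing.
skipped = guarded(vector_idx) === nothing
```

Bailing out is the more conservative of the two options named above; splatting a constant vector would keep the analysis running for vector indices but needs an element-wise constant instead.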

Reproduction (Julia 1.10.11, Enzyme v0.13.155, LLVM.jl 9.8.2, GPUCompiler 1.20.0) — this is test/nopre/enzyme.jl:45 reduced to its first failing call:

using Enzyme, LinearSolve, LinearAlgebra
n = 4
A = rand(n, n); dA = zeros(n, n); b1 = rand(n); db1 = zeros(n)
function f(A, b; alg = LUFactorization())
    prob = LinearProblem(A, b)
    sol = solve(prob, alg)
    norm(sol.u)
end
Enzyme.autodiff(Reverse, f, Duplicated(copy(A), dA), Duplicated(copy(b1), db1))

Only the lts (1.10) CI job hits this because LinearSolve gates Enzyme tests to VERSION < v"1.12.0-"; the failing path is also IR-dependent (vector GEPs from the 1.10 pipeline). Verified locally: with Enzyme pinned to 0.13.154 the full test/nopre/enzyme.jl passes; with 0.13.155 it fails exactly as in CI.
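
The version gate mentioned above relies on how Julia orders prerelease versions. A short sketch, where `enzyme_tests_enabled` is a hypothetical helper and not a function in LinearSolve:

```julia
# Hypothetical helper mirroring the gate `VERSION < v"1.12.0-"`.
# The trailing "-" makes the bound sort below every 1.12.0 prerelease,
# so release candidates and DEV builds of 1.12 are excluded as well.
enzyme_tests_enabled(v::VersionNumber) = v < v"1.12.0-"

lts_enabled = enzyme_tests_enabled(v"1.10.11")
rc_enabled = enzyme_tests_enabled(v"1.12.0-rc1")
release_enabled = enzyme_tests_enabled(v"1.12.0")
```

Here `lts_enabled` is true while `rc_enabled` and `release_enabled` are false, which is why only the 1.10 job compiles the Enzyme tests at all.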


The compat cap in this PR is the repo-side mitigation and should be reverted once an Enzyme release fixes the above.

…zyme

Enzyme >= 0.13.155 crashes compiling any primal containing
VectorizationBase explicit-SIMD gathers (vector GEP -> MethodError:
LLVM.ConstantInt(::LLVM.VectorType, ::Int64) in abs_typeof), which hits
RecursiveFactorization's TriangularSolve kernels. Upstream issue with an
Enzyme-only MWE: EnzymeAD/Enzyme.jl#3164

Only three sites compile RF kernels (verified by per-site probes on
Enzyme 0.13.155/0.13.154): the DefaultLinearSolver reverse call (RF
loaded makes the default compile the RFLU branch), the explicit
RFLUFactorization reverse call, and the Forward-mode RFLU jacobian.
These are now @test_broken so they flip to unexpected-pass errors when
a fixed Enzyme lands; all other Enzyme tests run as before on latest
Enzyme. Full file on 0.13.155/Julia 1.10.11: 40 pass, 3 broken.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude ChrisRackauckas-Claude changed the title from "Cap Enzyme to <=0.13.154 in NoPre test env to fix NoPre lts CI" to "Mark RecursiveFactorization Enzyme tests broken (Enzyme >= 0.13.155 regression)" Jun 10, 2026
@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 10, 2026 11:00
@ChrisRackauckas ChrisRackauckas merged commit 4f78c7d into SciML:main Jun 10, 2026
8 of 13 checks passed
@wsmoses

wsmoses commented Jun 11, 2026

Contributor

@ChrisRackauckas-Claude this was fixed, can you revert this?


3 participants