Skip to content

Opt PureKLU and RFLU out of split dual AD path (rebase of #1023)#1041

Merged
ChrisRackauckas merged 5 commits into
SciML:mainfrom
ChrisRackauckas-Claude:pr1023-rebase-pureklu-rflu-dual
Jun 14, 2026
Merged

Opt PureKLU and RFLU out of split dual AD path (rebase of #1023)#1041
ChrisRackauckas merged 5 commits into
SciML:mainfrom
ChrisRackauckas-Claude:pr1023-rebase-pureklu-rflu-dual

Conversation

@ChrisRackauckas-Claude

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Rebases and finalizes #1023 (branch ChrisRackauckas-patch-1) onto the current main. The original PR went stale (mergeable_state: dirty); I couldn't push to the SciML-owned branch, so this is the rebased version from a fork. It carries the same commits (authored by @ChrisRackauckas), now conflict-free on main. Supersedes #1023 — close that one in favor of this.

What it does (unchanged intent)

Opt PureKLUFactorization and RFLUFactorization out of the split primal/partials dual-AD path so they solve the Dual problem directly, and only take the direct path when A itself carries duals (with duals only in b, the split path — one primal factorization + partials back-solves — is cheaper than factorizing in dual arithmetic). Also promotes a plain b to the dual eltype on the direct path.

Net change is now just LinearSolveForwardDiffExt.jl + the test

The original PR also widened two PureKLU init_cacheval methods in LinearSolveSparseArraysExt.jl from Union{Float64, ComplexF64} to Union{Real, Complex}. That is dropped here — after rebasing onto main, it's redundant: #1037 ("Default to PureKLU for generic-eltype sparse LU") added a catch-all where {T <: Number, Ti <: Integer} method that already builds the correct empty cache for any number eltype, duals included. The widened specializations only duplicated it, so that file is left identical to main. (The last commit reverts it, with the reasoning; verified the PureKLU sparse-dual tests still pass.)

Conflicts resolved during rebase

main independently evolved both touched spots:

  • ext/LinearSolveForwardDiffExt.jl_use_direct_dual_solve had gained SpecializedLUFactorization/SpecializedQRFactorization on main. Resolved as the union: GenericLU || SpecializedLU || SpecializedQR || PureKLU || RFLU.
  • test/core/forwarddiff_overloads.jlmain switched the direct-path comparison to a robust dual_isapprox helper (plain over Dual vectors NaNs when a primal diff is exactly zero). Kept that helper and the Specialized* + least-squares tests, and re-expressed the PR's GenericLU/RFLU checks through dual_isapprox (they take the same direct path, so the same ulp/NaN concern applies). Unioned the using SpecializingFactorizations / using RecursiveFactorization imports.

Local verification

Julia 1.12 / Linux. Runic --check clean. test/core/forwarddiff_overloads.jl runs green end-to-end (all testsets, including the merged dual_isapprox checks for SpecializedLU/SpecializedQR/GenericLU/RFLU, the least-squares case, the plain-A/plain-b opt-out cases, and the PureKLU sparse-dual tests) — both with and without the now-dropped ext change.


Please ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

ChrisRackauckas and others added 4 commits June 14, 2026 04:31
Two fixes for the direct dual opt-out path:

1. PureKLUFactorization's sparse init_cacheval methods were restricted to
   Float64/ComplexF64 eltypes, so Dual-eltype sparse matrices fell through
   to the nothing fallback and solve! crashed on cacheval.nzval. PureKLU's
   kernels are pure Julia and generic over Union{Real, Complex}, so widen
   the specializations accordingly.

2. The direct dual path built its inner problem from dual_A/dual_b as
   stored, but when only A carries duals b is a plain array, and __init
   takes the solution eltype from b, so the dual solution could not be
   stored (MethodError: no method matching Float64(::Dual)). This also
   affected the pre-existing GenericLUFactorization opt-out. Promote b to
   the cache's dual type in _solve_direct_dual!, and only take the direct
   path when A itself carries duals: with duals just in b, the split path
   (one primal factorization + partials back-solves) is strictly cheaper
   and works for all algorithms.

Adds test coverage for the duals-only-in-A and duals-only-in-b cases for
GenericLUFactorization, RFLUFactorization, and PureKLUFactorization.

Co-authored-by: Chris Rackauckas (Claude) <accounts@chrisrackauckas.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
The original PR widened the `{T, Int64}` / `{T, Int32}` PureKLU `init_cacheval`
methods from `Union{Float64, ComplexF64}` to `Union{Real, Complex}` to let
ForwardDiff duals dispatch on the direct dual solve path. After rebasing onto
main, that is unnecessary: SciML#1037 ("Default to PureKLU for generic-eltype sparse
LU") added a catch-all `where {T <: Number, Ti <: Integer}` method that already
builds the correct empty cache for any number eltype (duals included). The
widened specializations only duplicated it, so revert that file to main.
Verified: test/core/forwarddiff_overloads.jl (incl. the PureKLU sparse-dual
cases) passes with this file unchanged from main.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 14, 2026 09:06
@ChrisRackauckas ChrisRackauckas merged commit c0544d2 into SciML:main Jun 14, 2026
41 of 55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants