Skip to content

Default to PureKLU for generic-eltype sparse LU (e.g. BigFloat)#1037

Merged
ChrisRackauckas merged 2 commits into
SciML:mainfrom
ChrisRackauckas-Claude:prefer-pureklu-generic-sparse-default
Jun 13, 2026
Merged

Default to PureKLU for generic-eltype sparse LU (e.g. BigFloat)#1037
ChrisRackauckas merged 2 commits into
SciML:mainfrom
ChrisRackauckas-Claude:prefer-pureklu-generic-sparse-default

Conversation

@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor

Summary

For non-BLAS sparse element types (e.g. BigFloat), the default sparse LU was SparspakFactorization. That has two problems:

  1. It requires the user to using Sparspak (otherwise the default errors).
  2. Sparspak's symbolic-factorization reuse produces invalid factorizations when the sparsity pattern changes between solves — exactly what the nonstructural_zeros per-solve dropzeros path does. This manifested as stalled BVP Newton iterations for BigFloat problems (Misc test group red on master: BigFloat BVP solves stall — regression from LinearSolve v3.85 nonstructural-zeros reduction BoundaryValueDiffEq.jl#509).

This PR makes PureKLU (a pure-Julia, hard dependency that factors any Number element type and correctly re-analyzes on pattern changes) the default sparse LU for generic eltypes.

Changes

  • Generic-eltype init_cacheval for PureKLUFactorization and SparseColumnPivotedQRFactorization, so the default polyalgorithm's KLU and SPQR slots are concretely typed for non-BLAS eltypes (the SPQR one is required so the singular-LU → column-pivoted-QR fallback can setfield! a real factorization into its slot).
  • defaultalg(::AbstractSparseMatrixCSC) for square systems now selects the KLU (PureKLU) slot instead of Sparspak. No using Sparspak required.
  • Test (test/core/default_algs.jl): BigFloat sparse default chooses KLU, solves accurately, re-solves correctly across numeric updates, and falls back to QR on a singular system.
  • Also includes the resolve.jl PureUMFPACKFactorization test fix (pre-existing master failure; shared with Fix Sparspak symbolic reuse invalidated by per-solve dropzeros #1036).

Benchmark (BigFloat, PureKLU vs Sparspak)

Single solve (factorize+solve) and 5× same-pattern resolve, seconds:

n density alg single(s) resolve5(s) rel.res
10 0.5 PureKLU 1.39e-4 6.21e-4 9e-78
10 0.5 Sparspak 1.06e-4 4.93e-4 1e-77
50 0.2 PureKLU 4.58e-3 2.24e-2 2e-77
50 0.2 Sparspak 4.90e-3 2.55e-2 2e-77
100 0.1 PureKLU 3.20e-2 0.194 3e-77
100 0.1 Sparspak 3.52e-2 0.223 2e-77
200 0.05 PureKLU 0.219 1.20 5e-77
200 0.05 Sparspak 0.307 1.17 2e-77

PureKLU is competitive-to-faster (notably faster at larger sizes) with comparable accuracy, and needs no extra using.

Test plan

  • test/core/default_algs.jl passes (incl. new BigFloat tests)
  • test/core/resolve.jl passes
  • BigFloat default solve, caching re-solve, and singular→QR fallback verified
  • End-to-end: the MIRKN4 BigFloat SecondOrderBVProblem from BoundaryValueDiffEq.jl#509 now returns Success (was Stalled) with this change
  • Full CI

Relationship to #1036

#1036 hardens SparspakFactorization itself against the same pattern-change bug (defense-in-depth for users who explicitly request Sparspak). This PR addresses the root cause for the default path by not selecting Sparspak for generic eltypes at all. They are complementary; this PR is the more fundamental fix for the BVP issue.

Note: Please ignore until reviewed by @ChrisRackauckas.

PureUMFPACKFactorization (added in SciML#1004) is an AbstractSparseFactorization
but was missing from resolve.jl's sparse-conversion lists, so the Re-solve
testset exercised it with a dense matrix and errored in do_factorization.
Add it alongside the other sparse LU factorizations.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
For non-BLAS sparse element types (e.g. BigFloat), the default sparse LU
was SparspakFactorization, which requires `using Sparspak` and whose
symbolic-factorization reuse produces invalid factorizations when the
sparsity pattern changes across solves (the per-solve dropzeros of the
nonstructural_zeros path). That manifested as stalled BVP Newton
iterations for BigFloat problems (SciML/BoundaryValueDiffEq.jl#509).

PureKLU is a pure-Julia, hard dependency that factors any `Number`
element type and correctly re-analyzes on pattern changes, so make it the
default sparse LU for generic eltypes. Benchmarks (BigFloat, n=10..200)
show PureKLU is competitive-to-faster than Sparspak (e.g. n=200: 0.219s
vs 0.307s single solve) with comparable accuracy, and it needs no extra
`using`.

- Generic-eltype `init_cacheval` for PureKLU and SparseColumnPivotedQR so
  the default polyalgorithm's KLU and SPQR slots are concretely typed for
  non-BLAS eltypes (the latter is required for the singular-LU QR fallback
  to `setfield!` a real factorization).
- `defaultalg(::AbstractSparseMatrixCSC)` for square systems now selects
  the KLU (PureKLU) slot.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 13, 2026 09:16
@ChrisRackauckas ChrisRackauckas merged commit 52ebf25 into SciML:main Jun 13, 2026
26 of 55 checks passed
ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/LinearSolve.jl that referenced this pull request Jun 14, 2026
The original PR widened the `{T, Int64}` / `{T, Int32}` PureKLU `init_cacheval`
methods from `Union{Float64, ComplexF64}` to `Union{Real, Complex}` to let
ForwardDiff duals dispatch on the direct dual solve path. After rebasing onto
main, that is unnecessary: SciML#1037 ("Default to PureKLU for generic-eltype sparse
LU") added a catch-all `where {T <: Number, Ti <: Integer}` method that already
builds the correct empty cache for any number eltype (duals included). The
widened specializations only duplicated it, so revert that file to main.
Verified: test/core/forwarddiff_overloads.jl (incl. the PureKLU sparse-dual
cases) passes with this file unchanged from main.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
ChrisRackauckas added a commit that referenced this pull request Jun 14, 2026
…arspak) since #1037 (#1043)

* Update DefaultsLoading test: sparse BigFloat defaults to PureKLU, not Sparspak

#1037 changed the default sparse LU for generic (non-BLAS) element types from
SparspakFactorization to PureKLU (a hard dependency), removing the
"SparspakFactorization required" error branch from `defaultalg`. The
DefaultsLoading test still asserted that error was thrown when solving a sparse
BigFloat problem before `using Sparspak`, so it failed on master with
"No exception thrown" (defaults_loading.jl:30).

Update the test to the new intended behavior: a generic-eltype sparse problem
now routes to `KLUFactorization` and solves without Sparspak loaded. Closes the
`@test_throws` into a positive assertion (defaultalg === KLUFactorization, solve
returns Vector{BigFloat}).

Fixes #1042.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

* Apply suggestion from @ChrisRackauckas

---------

Co-authored-by: ChrisRackauckas-Claude <accounts@chrisrackauckas.com>
ChrisRackauckas added a commit that referenced this pull request Jun 14, 2026
* Opt PureKLU and RFLU out of split dual AD path

* Update forwarddiff_overloads.jl

* Apply suggestion from @ChrisRackauckas

* Fix direct dual solve path for PureKLU and non-dual b (#1024)

Two fixes for the direct dual opt-out path:

1. PureKLUFactorization's sparse init_cacheval methods were restricted to
   Float64/ComplexF64 eltypes, so Dual-eltype sparse matrices fell through
   to the nothing fallback and solve! crashed on cacheval.nzval. PureKLU's
   kernels are pure Julia and generic over Union{Real, Complex}, so widen
   the specializations accordingly.

2. The direct dual path built its inner problem from dual_A/dual_b as
   stored, but when only A carries duals b is a plain array, and __init
   takes the solution eltype from b, so the dual solution could not be
   stored (MethodError: no method matching Float64(::Dual)). This also
   affected the pre-existing GenericLUFactorization opt-out. Promote b to
   the cache's dual type in _solve_direct_dual!, and only take the direct
   path when A itself carries duals: with duals just in b, the split path
   (one primal factorization + partials back-solves) is strictly cheaper
   and works for all algorithms.

Adds test coverage for the duals-only-in-A and duals-only-in-b cases for
GenericLUFactorization, RFLUFactorization, and PureKLUFactorization.

Co-authored-by: Chris Rackauckas (Claude) <accounts@chrisrackauckas.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

* Drop redundant PureKLU init_cacheval widening (subsumed by #1037)

The original PR widened the `{T, Int64}` / `{T, Int32}` PureKLU `init_cacheval`
methods from `Union{Float64, ComplexF64}` to `Union{Real, Complex}` to let
ForwardDiff duals dispatch on the direct dual solve path. After rebasing onto
main, that is unnecessary: #1037 ("Default to PureKLU for generic-eltype sparse
LU") added a catch-all `where {T <: Number, Ti <: Integer}` method that already
builds the correct empty cache for any number eltype (duals included). The
widened specializations only duplicated it, so revert that file to main.
Verified: test/core/forwarddiff_overloads.jl (incl. the PureKLU sparse-dual
cases) passes with this file unchanged from main.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

---------

Co-authored-by: Christopher Rackauckas <accounts@chrisrackauckas.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants