Widen UJacobianWrapper.p for nested ForwardDiff through Rosenbrock (#3381) #3414
Merged
ChrisRackauckas merged 1 commit on Apr 30, 2026
@ChrisRackauckas Earlier I thought the ForwardDiff < … was the root cause, as tagcount is … What do you think of this?
…ciML#3381)

When a Rosenbrock integrator runs with a Vector{<:Dual} p (i.e. we are inside an outer ForwardDiff layer - e.g. a NonlinearLeastSquares parameter fit), the inner Jacobian widens u into a deeper nested-Dual type via its prepared JacobianConfig, but p in UJacobianWrapper is still at the outer Dual level. The user body then multiplies p[i] * u[i] across two different Dual nesting levels and dispatches through ForwardDiff's @generated tagcount precedence, whose literal value is baked at first compile and varies with precompile ordering. That ordering can invert the nesting and crash setindex!(du, ...) with MethodError: no method matching Float64(::nested_dual).

Two-layer fix:

* OrdinaryDiffEqDifferentiation/src/derivative_wrappers.jl: `jacobian!` now lifts `uf.p` into the inner nested-Dual type once (via `_widen_uf_p_for_jac`) before delegating to `DI.jacobian!`. The widened `p` carries zero inner partials (correct - `p` does not depend on `u`). One `convert.(inner_T, p)` per `jacobian!` call is amortized across every chunk DI evaluates - no per-call allocation in the hot loop.
* DiffEqBase/ext/DiffEqBaseForwardDiffExt.jl: `wrapfun_iip` now compiles the Jacobian-case FunctionWrapper slots with the promoted `p` type, so that AutoSpecialize (FWW-wrapped) callers dispatch to a nested-`p` slot whose compiled body multiplies `u*p` within one tag hierarchy. `_dualify_eltype` bypasses ForwardDiff's unstable tag-precedence `promote_type` for the already-Dual case (which was inverting the outer tag non-deterministically across precompile boundaries) and uses the seeded DualT directly.

Without the DiffEqBase change the AutoSpecialize path either matched a slot compiled with the wrong tag ordering (silently cross-tag-multiplying and crashing in `setindex!`) or missed FWW dispatch entirely with `NoFunctionWrapperFoundError`, depending on precompile order.

Adds test/nested_forwarddiff_tests.jl covering FullSpecialize, a hand-rolled outer tag (the deterministic reproducer pre-fix), and the AutoSpecialize / FunctionWrappersWrapper path.

Bumps OrdinaryDiffEqDifferentiation 2.9.0 -> 2.9.1 and DiffEqBase 6.216.0 -> 6.216.1.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
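The widening step named in the commit message (`convert.(inner_T, p)`, with zero inner partials) can be illustrated in isolation. The following is a minimal sketch, not the PR's implementation: `OuterTag` and `InnerTag` are hypothetical stand-in tag types, and the inner element type is written out by hand here, whereas the PR recovers it from the prepared `JacobianConfig`.

```julia
using ForwardDiff
using ForwardDiff: Dual, value, partials

# Hypothetical tags standing in for the outer (parameter-fit) layer and the
# inner (Jacobian) layer. They are NOT the real SciML tag types.
struct OuterTag end
struct InnerTag end

# p as an outer ForwardDiff layer would seed it: one Dual level, two partials.
p = [Dual{OuterTag}(2.0, 1.0, 0.0), Dual{OuterTag}(3.0, 0.0, 1.0)]
OuterT = eltype(p)                       # Dual{OuterTag, Float64, 2}

# Element type the inner Jacobian uses for u: a Dual whose value type is the
# outer Dual. Hard-coded here as an assumption for illustration.
InnerT = Dual{InnerTag, OuterT, 1}

# u seeded at the inner level (value and partial are both outer Duals).
u = [Dual{InnerTag}(convert(OuterT, 1.5), one(OuterT)),
     Dual{InnerTag}(convert(OuterT, 0.5), zero(OuterT))]

# Widen p once: each entry becomes an InnerT whose value is the original
# outer Dual and whose inner partials are zero (p does not depend on u).
p_wide = convert.(InnerT, p)

# Both operands now share one nested-Dual type, so the product can be
# stored back into a buffer with u's element type.
du = similar(u)
du .= p_wide .* u
```

In this sketch `value.(p_wide)` returns the original outer Duals unchanged, so the outer layer's derivative information is preserved while the multiplication stays within one tag hierarchy.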
ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/OrdinaryDiffEq.jl that referenced this pull request on May 13, 2026
…ciML#3381) Re-applies the changes from PR SciML#3414 on top of current master. The original PR was reverted by SciML#3586; this restores it now that SciML#3488 (JacReuseState dtgamma fix) has landed and the two changes coexist without conflict (they touch disjoint files). Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/OrdinaryDiffEq.jl that referenced this pull request on May 21, 2026
…ciML#3381) Re-applies the changes from PR SciML#3414 on top of current master. The original PR was reverted by SciML#3586; this restores it now that SciML#3488 (JacReuseState dtgamma fix) has landed and the two changes coexist without conflict (they touch disjoint files). Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Fix for #3381. Revives the approach from the (closed) #3389: widen `UJacobianWrapper.p` once inside `jacobian!` in `OrdinaryDiffEqDifferentiation`, before `DI.jacobian!` runs, so the per-chunk hot loop is allocation-free. Supersedes #3412.

Summary

* `_widen_uf_p_for_jac(f, prep)` in `derivative_wrappers.jl` inspects the prepared `ForwardDiff.JacobianConfig` to recover the inner nested-Dual eltype and lifts `uf.p` into that type via a fresh `UJacobianWrapper` with `convert.(inner_T, p)`. It runs once per `jacobian!` call; every chunk DI evaluates inside that call reuses the widened `p`.
* The fallback (`_widen_uf_p_for_jac(f, prep) = f`) is a single type-stable no-op for non-nested, `AutoFiniteDiff`, or FWW-wrapped cases.
* `FunctionWrappersWrapper` (AutoSpecialize): DiffEqBase already precompiles the `(nested_u, outer_p, t)` FW slot, so the unwidened call matches; widening would yield `(nested_u, nested_p, t)`, which matches no slot and trips `AllowNonIsBits` into `NoFunctionWrapperFoundError`.
* Adds `test/nested_forwarddiff_tests.jl`. The hand-rolled outer-tag case deterministically reproduces the crash pre-fix (`MethodError: no method matching Float64(::Dual{...nested...})`) and passes post-fix.
* Bumps `OrdinaryDiffEqDifferentiation` to 2.9.1.

Root cause

`NonlinearSolve` seeds `p` with `Dual{NonlinearSolveTag, Float64, 2}`; inside `resid!` the Rosenbrock solve seeds `u` under `OrdinaryDiffEqTag`. For a `FullSpecialize` `ODEFunction`, the user body multiplies `p[i] * u[i]` across two different Dual nesting levels. Cross-tag multiplication resolves through ForwardDiff's `@generated` `tagcount` precedence: the literal value is baked at first compile and depends on which package precompiled which tag first. That ordering can invert the nesting and produce a triple-nested `Dual` that cannot be stored back into `du`, which surfaces as the `MethodError` quoted in the summary.

Lifting `p` on the inner solver's side normalizes both sides of `p[i] * u[i]` to the same nested-Dual type, so the cross-tag arithmetic never happens.

Why here, not DiffEqBase

A DiffEqBase-side change could promote the compiled FWW signature's `p` slot to carry the nested Dual, but it needs either an eager `Vector{DualU}` allocation per `UJacobianWrapper` call, or a lazy `WidenedDualArray` wrapper whose exact type parameters have to match the compiled FW slot. Neither is free, and both push the bug-class workaround into a package that otherwise does not know about OrdinaryDiffEq's Jacobian seeding. Widening once in `jacobian!` costs a single `convert.(inner_T, p)` per step and covers every chunk DI evaluates inside that step.

Test plan

* `Pkg.test("OrdinaryDiffEqDifferentiation")` Core group passes (25 tests, including 4 new nested-ForwardDiff tests).
* … `StalledSuccess` with the fix.

🤖 Generated with Claude Code
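The "no-op fallback plus one specialized method" shape described in the summary can be sketched with plain multiple dispatch. Every name below (`Wrapper`, `NestedPrep`, `PlainPrep`, `widen_p`) is hypothetical and chosen only for illustration; `Int` to `Float64` conversion stands in for widening to a nested Dual, and nothing here mirrors the actual signatures in `derivative_wrappers.jl`.

```julia
# Hypothetical stand-in for UJacobianWrapper: a function plus its parameters.
struct Wrapper{F,P}
    f::F
    p::P
end

# Hypothetical prep objects. NestedPrep carries the element type T that the
# inner differentiation layer works in; PlainPrep carries nothing.
struct NestedPrep{T} end
struct PlainPrep end

# Fallback: any wrapper/prep combination without a more specific method is
# returned untouched, so the non-nested paths pay nothing.
widen_p(w, prep) = w

# Specialized method: rebuild the wrapper once with p converted to T.
widen_p(w::Wrapper, ::NestedPrep{T}) where {T} = Wrapper(w.f, convert.(T, w.p))

w = Wrapper((u, p) -> u .* p, [1, 2])

same = widen_p(w, PlainPrep())            # fallback: returns w itself
wide = widen_p(w, NestedPrep{Float64}())  # specialized: p is now Vector{Float64}
```

Because the choice is made by dispatch on the prep type, the caller can invoke `widen_p` unconditionally and the compiler resolves which path applies.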