Skip to content

Guard solve_up Enzyme reverse rule against runtime-activity aliased shadows#3740

Open
SebastianM-C wants to merge 1 commit into
SciML:masterfrom
SebastianM-C:smc/enzyme_fix
Open

Guard solve_up Enzyme reverse rule against runtime-activity aliased shadows#3740
SebastianM-C wants to merge 1 commit into
SciML:masterfrom
SebastianM-C:smc/enzyme_fix

Conversation

@SebastianM-C

Copy link
Copy Markdown
Member

While looking into reducing SciML/SciMLSensitivity.jl#1477 with Claude, I found a potential bug in the Enzyme rule under set_runtime_activity(Reverse) when the u0 is aliased in remake (which happens because SII.setsym_oop returns the prob.u0 if there are no changes). The fix that Claude proposed was to skip accumulation when the shadow aliases the primal, exactly as if it were Const.

@ChrisRackauckas Would this be appropriate or should we always copy in SII.setsym_oop (or both)?

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

Add any other context about the problem here.

…hadows

Under `set_runtime_activity`, a runtime-inactive value reaches the rule as
Duplicated/MixedDuplicated whose shadow IS the primal (dval === val). The
reverse rule for `DiffEqBase.solve_up` accumulated every non-Const cotangent
into `ptr.dval` unconditionally, so e.g. the du0 cotangent was broadcast-added
into the caller's primal u0 whenever the solved problem's u0 aliased an array
reachable from a Const argument (the common `setsym_oop`/`remake` pattern in
MTK loss functions). The first gradient call returned correct results while
silently corrupting the Const problem's u0; subsequent calls were garbage.

Skip accumulation when the shadow aliases the primal — a runtime-inactive
value accumulates nowhere, exactly as if it were Const.

Found while reducing SciML/SciMLSensitivity.jl#1477 (failure mode B).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ChrisRackauckas

Copy link
Copy Markdown
Member

Seems reasonable but this should definitely get a test.

@SebastianM-C

Copy link
Copy Markdown
Member Author

Sure, but where would that go? From what I see SciMLSensitivity related testing is done via downstream. Should I PR the test to SciMLSensitivity?

@ChrisRackauckas

Copy link
Copy Markdown
Member

either there or the downstream ad test group

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants