Skip to content

Vsvo embedded bdf#3391

Open
KeshavVenkatesh wants to merge 2 commits into
SciML:masterfrom
KeshavVenkatesh:vsvo-embedded-bdf
Open

Vsvo embedded bdf#3391
KeshavVenkatesh wants to merge 2 commits into
SciML:masterfrom
KeshavVenkatesh:vsvo-embedded-bdf

Conversation

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor

Checklist

  • [Yes ] Appropriate tests were added
  • [ No] Any code changes were done in a way that does not break public API
  • [ Yes] All documentation related to code changes were updated
  • [ Yes] The new code follows the
    contributor guidelines, in particular the SciML Style Guide and
    COLPRAC.
  • Any new documentation only uses public API

Additional context

This PR contains changes to stabiity filter for backward euler and also VSVO implementation for BDF based on the paper here
Add any other context about the problem here.

Sorry for the delay it took me a whie to fully understand the paper and make sure the implementation matches with the algorithm defined in the paper

@ChrisRackauckas : Please review and provide feedback. I did rebase this time too.

@oscardssmith

Copy link
Copy Markdown
Member

How do the benchmarks look for this?

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

@oscardssmith @ChrisRackauckas
I did a benchmark test comparing MOOSE234 and FBDF mehtods. Even though MOOSE234 takes fewer steps on stiff problems, each step requires more Newton iterations (f-evals), so the total cost is higher. I tried an adaptive predictor degree approach but it did not close the gap. I'm investigating further and open to suggestions.

@ChrisRackauckas

Copy link
Copy Markdown
Member

Show the work precision diagrams.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

OOSE234 vs FBDF: Findings

Looking into why MOOSE234 uses more f-evals than FBDF. Here's what came out of
the analysis.

How I got here

Started by running work-precision diagrams (error vs f-evals) on VdP, ROBER,
and HIRES comparing MOOSE234 against FBDF, QNDF, CVODE_BDF, Rodas5P, RadauIIA5,
and KenCarp4. The plots showed MOOSE234's curve consistently shifted to the right
of FBDF — same accuracy but more work.

Since both solvers use the exact same Newton solver (nlsolve!), same convergence
logic, same maxiters=10, same kappa=1/100, the difference had to come from what
each method feeds into Newton or does around it. I wrote a step-by-step
diagnostic (newton_diagnostics.jl) that hooks into init()/step!() and logs
per-step Newton iterations, f-evals, Jacobian updates, and order for both solvers.

That trace showed two clear patterns:

  • MOOSE234: nf per step = newton_iter + 1 (consistently)
  • FBDF: nf per step = newton_iter (consistently)

So every single step, MOOSE234 was spending one extra f-eval that FBDF wasn't.
Traced it to the f(fsallast) call at the end of perform_step!, then confirmed
that nothing reads that value (isfsal=false, interpolation uses Chebyshev nodes).

After fixing that, re-ran the work-precision plots and the per-step traces
across all four problems to confirm the improvement and understand what gap
remains.

What was wrong

MOOSE234 was calling f(fsallast) after every step that was redundant.
FBDF skips this by recovering f(u) algebraically from the BDF equation.
Removed the dead call — saves exactly 1 f-eval per accepted step.

Numbers (MOOSE234 after fix vs FBDF)

Column key:
tol = solver tolerance (abstol = reltol = this value)
MOOSE234 = total function evaluations (nf) for MOOSE234
FBDF = total function evaluations (nf) for FBDF
Ratio = MOOSE234 nf / FBDF nf (lower is better for MOOSE234)
M steps = accepted steps taken by MOOSE234
F steps = accepted steps taken by FBDF

VdP (mu=1000):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 43 32 1.34x 24 21
1e-6 82 56 1.46x 37 43
1e-8 213 99 2.15x 71 75
1e-10 246 175 1.41x 194 156

(Before the fix, VdP at 1e-6 was 122 nf. Now 82. That's a 33% drop.)

ROBER:
tol MOOSE234 FBDF Ratio M steps F steps
1e-6 1142 569 2.01x 193 217
1e-8 5778 958 6.03x 785 495
1e-10 5128 1899 2.70x 3527 1184

HIRES (8 eqns):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 406 420 0.97x 47 85
1e-6 996 618 1.61x 159 169
1e-8 2017 1064 1.90x 640 385
1e-10 7210 2097 3.44x 2923 854

Linear stiff:
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 128 80 1.60x 41 55
1e-6 208 135 1.54x 116 102
1e-8 525 299 1.76x 438 224
1e-10 2009 675 2.98x 1918 503

Why MOOSE234 is still more expensive after the fix

Three things:

  1. FBDF goes up to order 5, MOOSE234 caps at 4. At tight tolerances this really
    shows — FBDF takes way fewer steps (ROBER at 1e-10: 1184 steps vs 3527).

  2. On VdP at loose tolerances, MOOSE234 gets stuck at order 2. At tol=1e-6 it's
    order 2 for 97% of steps. The other problems are fine — ROBER/HIRES/Linear
    all reach order 4 without issues. The problem is the order controller needs
    est3 < est2 to try order 4, and on VdP the FBDF4 filter correction is always
    bigger than the BDF3-Stab correction, so it never tries.

    Order 3 is basically never picked on any problem either — the method jumps
    straight from 2 to 4:

    Problem Ord 2 Ord 3 Ord 4 (at tol=1e-6)
    VdP 97% 0% 3%
    ROBER 10% 0% 90%
    HIRES 16% 1% 83%
    Linear 31% 1% 68%

  3. More Newton iterations in VdP's transient region. MOOSE234 has 5 steps needing
    4-6 Newton iters (t < 0.005) vs FBDF's 1 step with 3 iters. Being stuck at
    order 2 means a worse predictor, so Newton has to work harder.

What I am looking at next

  • Compute Est4 properly so the order controller can pick order 4 on VdP. The
    paper has some details I want to review.

  • Figure out why order 3 is never selected. Might be a problem with how the
    error estimators are set up.

  • MOOSE234 rejects a lot more steps than FBDF (HIRES at 1e-10: 109 vs 21).
    The step size controller might be too aggressive.

Work-precision plots (updated with fix)

Re-ran the full benchmark suite (7 solvers, tolerances 1e-4 to 1e-10) on VdP,
ROBER, and HIRES after the fix. Plots are in test/benchmark_results/.

VdP: MOOSE234 tracks close to FBDF at loose tolerances but falls behind at tight
tolerances where FBDF's order-5 capability kicks in. CVODE_BDF and Radau are the
most efficient on this problem overall.

ROBER: MOOSE234 is the most expensive solver across the board. Its curve is far
to the right of everything else, especially at tight tolerances. This is the
worst-case problem for MOOSE234.

HIRES: MOOSE234 is competitive at loose tolerances — its curve overlaps with
FBDF and QNDF around 1e-4 to 1e-5 error. The gap grows at tight tolerances
but stays smaller than on ROBER. This is the closest to a "win" for MOOSE234
among the problems tested.

Across all three problems, the nf-based work-precision plots confirm that the
f(fsallast) fix helped (curves shifted left compared to pre-fix runs) but
MOOSE234 still has structural disadvantages vs FBDF from the order cap and
order controller behavior described above.

hires_work_precision_nf hires_work_precision_nsolve rober_work_precision_nf rober_work_precision_nsolve vanderpol_work_precision_nf vanderpol_work_precision_nsolve

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

cc @oscardssmith @ChrisRackauckas : Please review and provide suggestions. i plan to spend more time over the weekend to dive deep.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Since it is a variable order, u_history might be corrupted with incorrect values causing estimate for est3 and est2 to be error prone. I am looking at storing the raw values and computing everytime.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

I spent more time on this, from my understanding the paper is not claiming improvement over FBDF methods instead in shows improvement only against its own fixed-order components (BDF3-Stab, BDF3, FBDF4). I am reconfirming this and also regenerating the plots. Will it help to contact paper authors to confirm and also discuss gaps we are seeing with FBDF methods?

@ChrisRackauckas

Copy link
Copy Markdown
Member

Probably not, it's not a great paper so they didn't check against anything of use. This is all beyond the paper

@ChrisRackauckas

Copy link
Copy Markdown
Member

Is there a reason this couldn't just be added as an adjustment option to QNDF/FBDF? I don't see why it needs a separate stepper.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Is there a reason this couldn't just be added as an adjustment option to QNDF/FBDF? I don't see why it needs a separate stepper.

Yes I will look into this.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

These are the datapoints I can see from the MOOSE234 implementation

1 For the same error, MOOSE234 takes fewer steps than BDF3-Stab
Both MOOSE234 and BDF3-Stab (order 2 locked) do exactly one BDF3 solve per step. MOOSE234 adaptively selects order 4 for 87-99% of steps via the FBDF4 filter, upgrading from O(h²) to O(h⁴) convergence at zero additional solve cost. This helps reduce the overall error

  1. BDF4 alone wastes 30% of compute on rejections

FBDF4 (order 4 locked, no A-stable fallback) is not stable during VdP's sharp transients. It repeatedly tries steps that fail, wasting BDF3 solves on rejected steps. MOOSE234 avoids this by switching to the A-stable BDF3-Stab (order 2) during transients.

value_prop_combined work_comparison_bars

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Is there a reason this couldn't just be added as an adjustment option to QNDF/FBDF? I don't see why it needs a separate stepper.

We could incoportate the following ideas from MOOSE234 to FBDF as options:

BDF3-Stab as an A-stable fallback — instead of dropping to BDF1/BDF2 during difficult transients, apply the G-stable BDF3-Stab filter (smaller error constant than BDF2, about 2.7x)

Do you think I should try to apply this and see for FBDF?

@ChrisRackauckas

Copy link
Copy Markdown
Member

yup and then make it an option in fbdf

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

I want to share my understanding and some updates from my side.

  1. Hypothesis: For problems where we are spending steps in BDF2 (order2) , this stab filter approach from MOOSE234 might help with reduced number of steps for FBDF and possibly reduce error (2.7x) during these order 2 steps.
  2. For problems like ROBER, HIRES and VdP problems less than 10% of the steps are executed under order 2 and so gains does not seem very high.
  3. However I did the implementation to leverage the BDF3-stab if order 2 is reached in a step in FBDF.
  4. I wrote a test case to compare the accuracy and efficiency of this approach and these are the results.

Problem Default steps StabFB steps Change Default rej% StabFB rej%
VdP (t=3000) 9,201 9,515 -3.4% 12.8% 15.2%
VdP (t=6.3) 497 686 -38% 0.8% 16.8%
ROBER 3,313 3,392 -2.4% 1.4% 2.8%
HIRES 2,525 2,688 -6.5% 3.8% 7.0%

As you can see, this approach is resulting in more steps than FBD4 contradictory to our hypothesis.

  1. My interpretation: When we switch to BDF3-stab we have to compute BDF3 which is not A-Stable during stiff transients, Newton (which is BDF3) may not converge at the dt that was chosen for BDF2. To apply the BDF3-Stab we have to solve a BDF3 which is not A-Stable . This resulted in larger number of rejections.

Please correct me If I am mistaken in my implementation or plan. If this is what you had in mind, IMHO it not worth implementing MOOSE234 optiomization for FBDF algorithm

@ChrisRackauckas

Copy link
Copy Markdown
Member

count the number of steps at a given order.

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Order Distribution at tol=1e-6
Comparison of FBDF and StabFB

Problem Variant Steps Ord 1 Ord 2 Ord 3 Ord 4 Ord 5 Rej Error
VdP (t=3000) Default 1397 31 (2%) 119 (9%) 111 (8%) 112 (8%) 1024 (73%) 234 8.81e-5
VdP (t=3000) StabFB 1354 16 (1%) 117 (9%) 99 (7%) 105 (8%) 1017 (75%) 266 9.47e-5
VdP (t=6.3) Default 43 3 (7%) 22 (51%) 8 (19%) 7 (16%) 3 (7%) 0 2.38e-9
VdP (t=6.3) StabFB 48 7 (15%) 22 (46%) 11 (23%) 7 (15%) 1 (2%) 11 1.11e-8
ROBER Default 217 3 (1%) 11 (5%) 9 (4%) 7 (3%) 187 (86%) 6 8.97e-7
ROBER StabFB 223 3 (1%) 15 (7%) 10 (4%) 14 (6%) 181 (81%) 8 7.29e-7
HIRES Default 169 3 (2%) 11 (7%) 17 (10%) 18 (11%) 120 (71%) 13 4.83e-6
HIRES StabFB 176 3 (2%) 14 (8%) 19 (11%) 14 (8%) 126 (72%) 20 5.01e-6

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs standard converge tests. Lock the order and test convergence

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the tests, please review.

Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=2, out-of-place) | 3 3 1.0s
Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=2, in-place) | 3 3 1.3s
Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=3, out-of-place) | 3 3 0.0s
Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=3, in-place) | 3 3 0.0s
Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=4, out-of-place) | 3 3 0.0s
Test Summary: | Pass Total Time
MOOSE234 fixed-order convergence (order=4, in-place) | 3 3 0.0s

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this file separate from the others?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this in the latest upload

@@ -41,6 +41,11 @@ end
nlsolvefail(nlsolver) && return
u = nlsolver.tmp + z

if alg.time_filter && integrator.success_iter > 0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs a rebase.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rebased and resolved the conflicts. Please review

…lter

MOOSE234: Variable-stepsize, variable-order (2/3/4) implicit method based
on DeCaria et al. (arXiv:1810.06670v1). Performs a single BDF3 nonlinear
solve per step with two cheap time-filter post-processing steps to produce
embedded solutions at orders 2 and 4.

FBDF stab_filter: Optional BDF3-Stab correction applied when FBDF drops to
order 2 (off by default, enable with `FBDF(stab_filter=true)`).

ImplicitEuler time_filter: Optional 2nd-order time-filter correction that
raises effective order from 1 to 2 (off by default, enable with
`ImplicitEuler(time_filter=true)`).
@KeshavVenkatesh

KeshavVenkatesh commented May 31, 2026

Copy link
Copy Markdown
Contributor Author

I had to resolve several conflicts with the rebase operation. I have uploaded the latest changes with refreshing to master for these changes

  1. MOOSE234: a variable-stepsize, variable-order (2/3/4) , Performs a single BDF3 nonlinear solve per step, then applies two cheap linear time-filter post-processing steps to produce embedded solutions at orders 2 and 4.
  2. Add optional stab_filter kwarg to FBDF that applies BDF3-Stab correction when order drops to 2.

@oscardssmith @ChrisRackauckas : Please review and let me know how you want to proceed. It is not very clear if we should merge one or both of these changes.

@ChrisRackauckas

Copy link
Copy Markdown
Member

Add optional stab_filter kwarg to FBDF that applies BDF3-Stab correction when order drops to 2.

You mean when order is 3?

The BDF3-Stab filter requires a BDF3 solve as input, so it should activate
when k==3 (not k==2). Also override EEst with the embedded difference
(y² - y³) when the filter fires, giving the step size controller a correct
error magnitude without changing cache.order (which controls the next solve).
@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Add optional stab_filter kwarg to FBDF that applies BDF3-Stab correction when order drops to 2.

You mean when order is 3?

Yes

@KeshavVenkatesh

Copy link
Copy Markdown
Contributor Author

Benchmarking the BDF3-Stab filter on FBDF (both at order 2 and order 3, with and without proper error estimation), I don't see a clear performance benefit. I'll continue investigating whether there are specific problem classes where it helps, but for now we could exclude it from PR.

Do we see MOOSE234 as ready to merge, or are there open questions about its value proposition that we should measure more carefully?

@ChrisRackauckas

Copy link
Copy Markdown
Member

Okay then yes split Moose

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants