Add a numerical mixing diagnostic to MOM6 by jbisits · Pull Request #57 · ACCESS-NRI/MOM6

jbisits · 2026-06-02T00:58:27Z

This PR adds a diagnostic that can track the variance production over the transport timestep, which is related to the numerical mixing of tracers, of any registered tracer in the MOM6 tracer directory. The diagnostic is:

$$ \begin{eqnarray} %&&\text{{Advection scheme variance production in 2D/vertical ALE cell $i,j$}} =\nonumber &&\chi_{i, j}^{\tau} = \bigg[\frac{hc^2}{\Delta t}\bigg]_{i,j,\tau}^{i,j,\tau + 1} + \frac{1}{\Delta_{i,j}A} \left( \bigg[ 2f^xc - Uc^2 \bigg]_{i-\frac{1}{2},j,\tau}^{i+\frac{1}{2},j,\tau} + \bigg[ 2f^yc - Vc^2 \bigg]_{i,j-\frac{1}{2},\tau}^{i,j+\frac{1}{2},\tau} \right) \end{eqnarray} $$

We have a manuscript that details the derivation and uses the diagnostic in ANU-TUB that I think we can share if needed.

The diagnostic (equation above) is computed in src/tracer/MOM_tracer_numerical_mixing.F90 from prognostic and diagnostic variables available in MOM6. The terms in the equation above are:

$c$: tracer concentration (any tracer that is registered in tracer registry)
$h$: thickness at previous and current timestep;
$\Delta t$: transport timestep
$1 / \Delta A_{i, j}$: inverse grid cell area;
$f^{x}$: diagnostic x advective flux;
$U$: zonal mass transport;
$f^{y}$: diagnostic y advective flux;
$V$: meridional mass transport.

Upwind value are used for concentration on the grid cell faces. Please let me know what other information I should add here!

I am baffled and need some help with the failing thickness test. The diagnostic is split into two subroutines: thickness_weighted_variance_advection and thickness_weighted_variance_flux_divergence. When both subroutines are used the CI thickness test fails on the OBC test but if I run CI on the subroutines one at a time all tests pass; see CI only with variance flux subroutine and CI only with variance advection subroutine). Any help with this would be great!

Closes ACCESS-NRI/access-om3-configs#897

cc @janzika

jbisits · 2026-06-02T00:59:54Z

I need to bring this branch up to date with the current ACCESS release which I will do before this is ready for review. This should reduce the total number of changes as the only intended addition of this PR is the numerical mixing diagnostic. But because I forked from the mom6-ocean fork I have conflicts.

dougiesquire · 2026-06-02T01:04:40Z

@jbisits let me know if you'd like any help with rebasing

jbisits · 2026-06-02T01:14:11Z

@jbisits let me know if you'd like any help with rebasing

Yeah that would be great! I am happy to try first but if you have time I am sure you will be able to do it much faster! I can add a list of file that I have changed for the new diagnostic if that helps?

dougiesquire · 2026-06-02T02:24:47Z

I suspect rebasing will fix the failing tests - it looks like you might've incorrectly resolved a few merge conflicts along the way. I'll have a stab today.

…e uhtr not good choice)

jbisits · 2026-06-02T03:19:18Z

I suspect rebasing will fix the failing tests - it looks like you might've incorrectly resolved a few merge conflicts along the way. I'll have a stab today.

Great, thanks @dougiesquire! Unfortunately I do recall when I merged my fork with ACCESS-NRI I had a lot of conflicts and there were quite a few that I did not look at too carefully. My apologies! Let me know if there is anything I can do to help.

dougiesquire · 2026-06-02T03:28:17Z

That was a lot of commits and conflicts 😅. Unfortunately, it looks like the test-dim (h, vertical thickness) tests are still failing.

Could you please check that this looks like the full set of changes? Once you've confirmed, I'll force push to this branch

jbisits · 2026-06-02T03:37:13Z

That was a lot of commits and conflicts 😅. Unfortunately, it looks like the test-dim (h, vertical thickness) tests are still failing.

That's unfortunate.

Could you please check that this looks like the full set of changes? Once you've confirmed, I'll force push to this branch

Yep these look like all the changes, thanks!

dougiesquire · 2026-06-02T04:11:51Z

I've force pushed @jbisits. You'll probably want to update your local branch. Something like:

git fetch access-nri
git reset --hard access-nri/jib/numerical-mixing

jbisits · 2026-06-02T04:17:23Z

Thanks for all that work @dougiesquire! I have tried many times to fix the thickness test and always come up short. I also have not quite been able to figure out how the test works so I can run locally. Do you think it would be worth coming to the next TWG working group meeting and showing where this is all at? Maybe someone has encountered something like this before.

dougiesquire · 2026-06-03T01:33:01Z

I've tried to look into the failing test and I am also pretty baffled.

It's failing the thickness dimensional rescaling test for tc3. This runs the .testing/tc3 configuration with H_RESCALE_POWER=1 and H_RESCALE_POWER=11 and expects the same results.

The test fails because the remapped z-level _advection_scheme_variance_production diagnostics for all of the tr_D1-tr_D11 DOME tracers differ very slightly at every time step except for the first, e.g.:

...
- h-point: c=     24416 ocean_model_z-tr_D1_advection_scheme_variance_production 0 240
+ h-point: c=     24400 ocean_model_z-tr_D1_advection_scheme_variance_production 0 240
...
- h-point: c=     24416 ocean_model_z-tr_D2_advection_scheme_variance_production 0 240
+ h-point: c=     24400 ocean_model_z-tr_D2_advection_scheme_variance_production 0 240
...

The native diagnostics are the same between the two runs; it's only the remapped diagnostics that are different.

The magnitudes of the *_advection_scheme_variance_production diagnostics are very small in this configuration (somewhere around e-50). I think this is because there's effectively zero tracer advection. My guess is that what we're seeing is floating-point rounding differences in the z-level remapping.

I've output the tr_D1_advection_scheme_variance_production diagnostic on z-levels at double precision at every timestep to see where the differences are. These are the differences at every depth level at the final time step (note the tiny magnitudes):

I'm not really sure what to do about this at this point. The fact that the native diagnostics pass the test indicates that the computation of asvp is h-scaling-invariant, so I don't think reformulating the diagnostic can help. The issue seems to be the z-level remapping of near-zero values. Perhaps @marshallward has a suggestion?

jbisits · 2026-06-03T02:17:18Z

Thanks for the investigative work @dougiesquire!

The native diagnostics are the same between the two runs, it's only the remapped diagnostics that are different.

That's good!

I've output the diagnostics at double precision at every timestep to see where the differences are. These are the differences at every depth level at the final time step (note the tiny magnitudes):

For me to double check, this figure is the difference between z-tr_D1_advection_scheme_variance_production and z-tr_D2_advection_scheme_variance_production, where D1=1 and D2=11, at each depth level? And from the figure it is not every x and y point in the domain that there is error, only some? And at greater depths there is none?

I think this is because there's effectively zero tracer advection. My guess is that what we're seeing is floating-point rounding differences in the z-level remapping.

I don't know much about the diagnostic remapping but I am keen to learn more. Extremely unlikely I will find anything you have not done, or know, already though.

dougiesquire · 2026-06-03T02:27:19Z

For me to double check, this figure is the difference between z-tr_D1_advection_scheme_variance_production and z-tr_D2_advection_scheme_variance_production, where D1=1 and D2=11, at each depth level? And from the figure it is not every x and y point in the domain that there is error, only some? And at greater depths there is none?

Sorry, my comment was a bit unclear (now edited). The figure shows the difference in the tr_D1_advection_scheme_variance_production diagnostic on z-levels between the runs with H_RESCALE_POWER=1 and H_RESCALE_POWER=11. The other tracers (tr_D2 etc) show similar differences. Yeah, most of the points are the same (white). Some depths show no differences, but the location of the differences changes at each timestep, e.g. this is two timesteps prior:

jbisits · 2026-06-03T03:26:11Z

For me to double check, this figure is the difference between z-tr_D1_advection_scheme_variance_production and z-tr_D2_advection_scheme_variance_production, where D1=1 and D2=11, at each depth level? And from the figure it is not every x and y point in the domain that there is error, only some? And at greater depths there is none?

Sorry, my comment was a bit unclear (now edited). The figure shows the difference in the tr_D1_advection_scheme_variance_production diagnostic on z-levels between the runs with H_RESCALE_POWER=1 and H_RESCALE_POWER=11. The other tracers (tr_D2 etc) show similar differences. Yeah, most of the points are the same (white). Some depths show no differences, but the location of the differences changes at each timestep, e.g. this is two timesteps prior:

Cool, actually was me not understanding but all cleared up now!

marshallward · 2026-06-03T12:28:29Z

I don't have much time to have a look this week, but could it be a H vs Z mixup somewhere?

jbisits · 2026-06-04T04:09:24Z

Would a H and Z mixup be more likely internally or when outputting?

Here is the internal dimensional analysis of the computation in the subroutines:

thickness_weighted_variance_advection computation:

Ih = 1 / h(i,j,k)                                                              !< [H-1]
ht_prev = h_diag(i,j,k) * Tr%t_prev(i,j,k)                                     !< [H]*[CU] = [H CU] 
ht_adv = ht_prev + dt * Tr%advection_xy(i,j,k)                                 !< [H CU] + [T]*[CU H T-1] = [H CU]
asvp(i,j,k) = ( (Ih * (ht_adv*ht_adv)) - (ht_prev * Tr%t_prev(i,j,k)) ) * Idt  !< ([H-1]*[H2 CU2] - [H CU]*[CU]) * [T-1] = [CU H T-1]

thickness_weighted_variance_flux_divergence computation:

 east = (2 * (Tr%ad_x(I,j,k)  *x_upwind(I,j,k)))   - ((Idt*uhtr(I,j,k))   * (x_upwind(I,j,k)  *x_upwind(I,j,k)))    !< [CU H L2 T-1] * [CU] - [T-1]*[H L2]*[CU]*[CU] = [CU2 H L2 T-1]
 west = (2 * (Tr%ad_x(I-1,j,k)*x_upwind(I-1,j,k))) - ((Idt*uhtr(I-1,j,k)) * (x_upwind(I-1,j,k)*x_upwind(I-1,j,k)))  !< [CU H L2 T-1] * [CU] - [T-1]*[H L2]*[CU]*[CU] = [CU2 H L2 T-1]
north = (2 * (Tr%ad_y(i,J,k)  *y_upwind(i,J,k)))   - ((Idt*vhtr(i,J,k))   * (y_upwind(i,J,k)  *y_upwind(i,J,k)))    !< [CU H L2 T-1] * [CU] - [T-1]*[H L2]*[CU]*[CU] = [CU2 H L2 T-1]
south = (2 * (Tr%ad_y(i,J-1,k)*y_upwind(i,J-1,k))) - ((Idt*vhtr(i,J-1,k)) * (y_upwind(i,J-1,k)*y_upwind(i,J-1,k)))  !< [CU H L2 T-1] * [CU] - [T-1]*[H L2]*[CU]*[CU] = [CU2 H L2 T-1]
asvp(i,j,k) = asvp(i,j,k) + (((east - west) + (north - south)) * G%IareaT(i,j)) !< [CU2 H L2 T-1] * [L-2] = [CU2 H T-1]

And the conversion when outputting:

Tr%id_advection_scheme_variance_production = register_diag_field("ocean_model", &
    trim(shortnm)//"_advection_scheme_variance_production", diag%axesTL, Time, &
     "Spurious variance production of "//trim(shortnm)//" variance due to advection", &
     trim(Tr%units)//"2 m s-1", conversion=(TR%conc_scale**2)*GV%H_to_MKS*US%s_to_T)

marshallward · 2026-06-04T10:41:55Z

tc3 is based on a zero-flow open boundary test. Looking at OBC code might help to track it down.

But the way to do these is to turn on Debug = True and diff through the output. You might need to add extra [huvB]chksum() functions to narrow it down.

If there's no urgency here then I can also give it a try next week.

jbisits · 2026-06-05T00:42:50Z

tc3 is based on a zero-flow open boundary test. Looking at OBC code might help to track it down.

But the way to do these is to turn on Debug = True and diff through the output. You might need to add extra [huvB]chksum() functions to narrow it down.

I have the tests up and running locally so I will see if I can make any progress on this.

If there's no urgency here then I can also give it a try next week.

No urgency here; would be great if you could take a look next week!

jbisits · 2026-06-05T04:49:56Z

I ran with Debug = True and the only diff between work/symmetric/std.err and work/dim.h/str.err was:

diff symmetric/std.err dim.h/std.err
5651c5651
<                                Memuse(MB) at Memory HiWaterMark=  4.502E+01  4.502E+01  0.000E+00  4.502E+01
---
>                                Memuse(MB) at Memory HiWaterMark=  4.505E+01  4.505E+01  0.000E+00  4.505E+01

Does this indicate there was no difference between the model runs (apart from memory usage)? If you get some time to take a look next week @marshallward that would be great!

marshallward · 2026-06-10T17:27:02Z

I was able to reproduce it on my laptop:

cd .testing
make -j tc3.dim.h.diag

and the error is in ocean_model_z-tr_D3_advection_scheme_variance_production.

DEBUG = True unfortunately doesn't have much to add, since there are (understandably) very few debug checksums in your new module.

I added some chksum calls for asvp, and the min and max do seem to scale like H, so that part is likely correct.

But I also noticed is that these numbers are very small.

h-point: mean=   0.0000000000000000E+00 min=   1.4165571386305437E-59 max=   1.1872909495924985E-43 Tr nummix after advect: asvp
h-point: c=     25200 Tr nummix after advect: asvp                              
h-point: mean=   0.0000000000000000E+00 min=   1.4165571386305437E-59 max=   1.1872909495924985E-43 Tr nummix after div: asvp
h-point: c=     25200 Tr nummix after div: asvp

The min and max are 1e-43 to 1e-60. The mean is reported as zero, because most of these values are too small to be recorded in our reproducing_sum() subroutine.

Also, if I reduce H_RESCALE_POWER to 6 (i.e. rescale H by 2**6), then it passes. If I increase to 7, then it fails.

So the short answer is that your numbers are too small to apply dimensional scaling.

There are a few things you can do here, perhaps add a "floor" function that zeros out numbers below a certain value. Or perhaps there is something about your calculation that is capturing residuals in a strange way.

@Hallberg-NOAA has encountered this before, so he might be a better person to comment.

jbisits · 2026-06-10T23:08:01Z

Thanks for taking a look @marshallward!

There are a few things you can do here, perhaps add a "floor" function that zeros out numbers below a certain value.

I can add this but will just need to choose the value below which to zero below. The maximum value in your example above may work for these tests but perhaps there is more appropriate value to choose (though I do not know what that might be!).

@Hallberg-NOAA if you have other suggestions or comments please let me know!

Thanks again for the help on this!

dougiesquire · 2026-06-10T23:23:08Z

Thanks for looking @marshallward

So the short answer is that your numbers are too small to apply dimensional scaling.

Just to state the obvious: As I mentioned here, the native diagnostics (e.g. ocean_model-tr_D1_advection_scheme_variance_production) are obviously equally small and do pass dimensional rescaling, even with H_RESCALE_POWER=11. It's only the remapped z-level diagnostics that fail.

There are a few things you can do here, perhaps add a "floor" function that zeros out numbers below a certain value. Or perhaps there is something about your calculation that is capturing residuals in a strange way.

I'm not really across the diagnostic but it may effectively be defined as a residual. If there were no numerical mixing, the two terms cancel? That at least seems to be the case in the failing configuration - e.g. these are the chksums for each term:

h-point: mean=   0.0000000000000000E+00 min=  -4.6008012880328669E-43 max=   2.3705811131449731E-42 ocean_model_z-tr_D1_thickness_weighted_variance_advection 0 240
h-point: c=     25652 ocean_model_z-tr_D1_thickness_weighted_variance_advection 0 240

h-point: mean=   0.0000000000000000E+00 min=  -2.3705811131449811E-42 max=   4.6008012880327427E-43 ocean_model_z-tr_D1_thickness_weighted_variance_flux_divergence 0 240
h-point: c=     25628 ocean_model_z-tr_D1_thickness_weighted_variance_flux_divergence 0 240

I guess this makes it particularly prone to this kind of issue.

Good to know that flooring is an option. I wasn't sure that MOM6 maintainers would be happy with such an approach.

jbisits · 2026-06-10T23:48:03Z

I'm not really across the diagnostic but it may effectively be defined as a residual. If there were no numerical mixing, the two terms cancel?

If the two terms cancelled there would be no variance production from the advection scheme i.e. the advection scheme would conserve the second moment of tracers (which they tend not to unless they are specifically designed to). So you are right @dougiesquire it can be thought of as a residual for tracer variance over the advection timestep. Apologies if this is an obvious restatement of your question!

That at least seems to be the case in the failing configuration - e.g. these are the chksums for each term:

h-point: mean=   0.0000000000000000E+00 min=  -4.6008012880328669E-43 max=   2.3705811131449731E-42 ocean_model_z-tr_D1_thickness_weighted_variance_advection 0 240
h-point: c=     25652 ocean_model_z-tr_D1_thickness_weighted_variance_advection 0 240

h-point: mean=   0.0000000000000000E+00 min=  -2.3705811131449811E-42 max=   4.6008012880327427E-43 ocean_model_z-tr_D1_thickness_weighted_variance_flux_divergence 0 240
h-point: c=     25628 ocean_model_z-tr_D1_thickness_weighted_variance_flux_divergence 0 240

Ah, this explains when I ran CI on the subroutines separately they passed; see above. Is summing these two very small components where you thought the numerical error might be coming from in the remapped z case? Or is it in the bit count that these numbers might be slightly different but very close?

marshallward · 2026-06-11T13:09:50Z

Good to know that flooring is an option. I wasn't sure that MOM6 maintainers would be happy with such an approach.

I think this has been done before, but I probably would only do it as a last resort.

I wonder if this operation should be handled inside of chksum rather than the solvers. But this is something that @Hallberg-NOAA should address. (He is on leave this week but should be back soon.)

(Also apologies @dougiesquire for not reading your earlier message more carefully about the smallness of the values.)

jbisits added 26 commits June 2, 2026 12:38

Code cleanup

010e266

Semicolons not commas, correct order for subroutine call

5bf3c2d

Use uhtr*Idt insteatd of umo/rho0 (same for v)

361d933

Missed reference density

7d80f8b

Hard code reference density (might have caused the netcdf error)

19c8b33

Try thickness weighted variance only (netcdf error occured again mayb…

6c3b60b

…e uhtr not good choice)

Use reference density from GV

30aefa6

Remove old unit designator

0be62ab

Remove unnecessary definition of reference density

e2dcc9d

Only zonal fluxes + try scaling the upwind calculation

bb846a0

Use locally defined and - would be ideal if I could find where they are

b411d30

Set upwind variables to zero

be86542

Homogenise upwind value definition

667dee8

More specific bracketing in thickness weighted variance change

d2930d3

Only return upwind difference

53ef6c8

Check non-upwind part of zonal fluxes calculation

9722ba2

Just look at left upwind i.e. the left tracer values

2eeb3c9

west upwind values

d2aa63f

Back to east upwind values

c1ae7b0

Return the upwind array

0e73593

Make nm same size as upwind for saving

89cc31b

Return nm array to correct size

42d1e8a

Loop over vertices

0345cce

Zonal fluxes with vertex indexing

c8a90e0

Change index to start 1 prior

b0e27a8

Temporary resize to save upwind tracer array

8e71d9f

jbisits added 9 commits June 2, 2026 13:18

More white space to be removed

7f2dd57

Rename diagnostic to advection_scheme_variance_production

2d19f13

Forgot tracer registry

5b5c4ab

Use single subroutine for variance flux + tidy up docstring spacing

5b7e628

res -> asvp, remove last **2 opertation, change order of subroutines

5853822

Forgot to add value to existing array

c352fc1

Return to original order + CI only on variance flux

816b830

CI only on variance advection

eabf510

Include both terms

9b1dd04

dougiesquire force-pushed the jib/numerical-mixing branch from 5386bbb to 9b1dd04 Compare June 2, 2026 04:08

jbisits mentioned this pull request Jun 9, 2026

Diagnosing numerical mixing in ACCESS-OM3 models ACCESS-Community-Hub/access-om3-paper-1#103

Open

Conversation

jbisits commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbisits commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dougiesquire commented Jun 2, 2026

Uh oh!

jbisits commented Jun 2, 2026

Uh oh!

dougiesquire commented Jun 2, 2026

Uh oh!

jbisits commented Jun 2, 2026

Uh oh!

dougiesquire commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbisits commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dougiesquire commented Jun 2, 2026

Uh oh!

jbisits commented Jun 2, 2026

Uh oh!

dougiesquire commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbisits commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dougiesquire commented Jun 3, 2026

Uh oh!

jbisits commented Jun 3, 2026

Uh oh!

marshallward commented Jun 3, 2026

Uh oh!

jbisits commented Jun 4, 2026

Uh oh!

marshallward commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbisits commented Jun 5, 2026

Uh oh!

jbisits commented Jun 5, 2026

Uh oh!

marshallward commented Jun 10, 2026

Uh oh!

jbisits commented Jun 10, 2026

Uh oh!

dougiesquire commented Jun 10, 2026

Uh oh!

jbisits commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

marshallward commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jbisits commented Jun 2, 2026 •

edited

Loading

jbisits commented Jun 2, 2026 •

edited

Loading

dougiesquire commented Jun 2, 2026 •

edited

Loading

jbisits commented Jun 2, 2026 •

edited

Loading

dougiesquire commented Jun 3, 2026 •

edited

Loading

jbisits commented Jun 3, 2026 •

edited

Loading

marshallward commented Jun 4, 2026 •

edited

Loading

jbisits commented Jun 10, 2026 •

edited

Loading