Skip to content

MOST wall model for ABL simulations#2385

Open
loretet wants to merge 254 commits intodevelopfrom
feature/most
Open

MOST wall model for ABL simulations#2385
loretet wants to merge 254 commits intodevelopfrom
feature/most

Conversation

@loretet
Copy link

@loretet loretet commented Mar 16, 2026

This PR implements the Monin-Obukhov Similarity Theory (MOST) wall model on CPU and GPU.

Description

MOST is the most common wall model used in atmospheric science. Here, the friction velocity $u_*$ is computed based on some empirical similarity functions which depend on the thermal stratification of the boundary layer (in this version we use the same functions as the PALM model). The WSS $\tau$ is then computed from $u_*$ in the usual way and applied as a boundary condition.
More info can be found in Stull (1988) or Rotach and Holtslag (2025).

The MOST implementation is very similar to the rough_log_law wall model, but it is made more complex by a necessary branching in the algorithm: the similarity functions and stress expression have different shapes based on whether a Dirichlet or Neumann boundary condition is used for temperature, and also based on the local stratification of the flow (which has to be computed for every timestep and point). This makes the GPU implementation not very efficient, I think (especially if it is a noob like me to code it :D)

Implementation

The MOST wall model requires more specification than rough_log_law. An example of implementation is as follows:

 "boundary_conditions": [
       {
            "type": "wall_model",
            "model": "most",
            "kappa": 0.4,     // Von Karman constant
            "z0": 0.1,     // the momentum roughness length
            "z0h": 0.1,      // the heat roughness length (if negative, Zilitinkevich formula is used)
            "type_of_temp_bc": "neumann",   // supports either Neumann or Dirichlet as b.c.
            "bottom_bc_flux_or_temp": 0.05,     // the temperature or flux value imposed at the boundary
            "zone_indices": [
                       5
             ],
            "h_index":1     // the model samples the flow at the first index above the boundary
        },
....
]

Limitations/issues

At the moment, in a typical ABL simulation it is necessary to specify multiple times in the case file the same variables (e.g. g or q). The redundancy of parameters in the case file can be reduced by using the constants group (#2228).
Another limitation is that - for now - it is only possible to use this wall model on one boundary at the time (it is unlikely that anybody will ever need it on two different boundaries, but it's still a possible future work).

Results

Here is a comparison made between the CPU and GPU implementation of the wall model in Neko. Both the rough log-law and the MOST model appear. The simulation had a geostrophic wind flow of 10 m/s and a bottom heat flux of 0.05Kms-1, used buoyancy-corrected Vreman as SGS model and has 24 elements (polynomial degree 7).
Coding-wise, the CPU and GPU implementations seem to be exactly the same for me, and I think the slight difference in the results is acceptable although maybe a bit surprising (tested on Dardel). Here, $v$ is the spanwise velocity.

uv t

Note: Nek5000 doesn't appear in the plots. This is because Nek5000 and Neko have always given slightly different results in some ABL flows, and we are still investigating this issue. However, the discrepancy within the codes should not be due to the wall model, hence the PR.

@loretet loretet added the don't merge Don't merge yet! label Mar 16, 2026
@loretet
Copy link
Author

loretet commented Mar 17, 2026

Quick update after March 17 developers' meeting:

  • The difference between CPU and GPU is concerning for Niclas. Philipp suggested investigating the influence of the time average by doing batch averages and Siavash suggested to also look at very close time average after restarting from the same file. I will do that and publish some updates
  • Tim suggested to open a separate PR/issue/discussion for the difference between CPU and GPU implementation of the rough_log_law model, which shows differences even if it is a much simpler model. Will do that as well

Copy link
Contributor

@linneahuusko linneahuusko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I didn't look much at the device files since I don't really know what to look for there, but the rest looks good. I left some minor comments and thoughts about things we should consider how to do.

real(kind=rp) :: magu, utau, normu, z0h
real(kind=rp) :: L_ob, L_upper, L_lower, L_old
real(kind=rp) :: Ri_b, f, dfdl, fd_h, L_new, L_sign
real(kind=rp), parameter :: g = 9.80665_rp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gravity should not be hardcoded here since we allow it as a parameter in the case file for other components (e.g., Boussinesq, Deardorff). I guess you are assuming the direction of gravity to be wall normal, which makes sense in our cases, but how can we generalize it? Making it truly general I think would be a lot of work, but I think we should at least let the user provide a g_vector and then put an error if it has anything other than a component in the z direction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, we can/should definitely generalise it. I have to include rho (and mu) as well in the wall model inputs, I will add g as well (then the user can specify them all in a constants block in the case file)

Comment on lines +282 to +301
! Print some indicative quantities (these are just point quantities: don't trust 100%)
call neko_log%section('Wall model quick look')
write(log_buf, '(A,E15.7)') 'Ri_b: ', Ri_b
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'L_ob: ', L_ob
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'utau: ', utau
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'magu: ', magu
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'ts: ', ts
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'ti: ', ti
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'q: ', q
call neko_log%message(trim(log_buf))
write(log_buf, '(A,E15.7)') 'hi: ', hi
call neko_log%message(trim(log_buf))
call neko_log%end_section()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I think the goal should be to get the mean, min and max of these variables, and also write them to a separate file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I didn't want to include this part at first but I left it in the end so that the user can just have a very rough idea if there are weird values or NaNs. A better diagnostics is necessary for sure

u => neko_registry%get_field("u")
v => neko_registry%get_field("v")
w => neko_registry%get_field("w")
temp => neko_registry%get_field("temperature")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should allow for other names of the scalar field than "temperature"? Like in the buoyancy corrected Vreman, where we put "scalar_field" in the json file (that does add another parameter to the json but it might be good to not assume field names)

Comment on lines +305 to +314

// Zilitinkevich 1995 correlation for thermal roughness
T z0h;
if (z0h_in < 0.0) {
const T nu = 1.46e-5; // Kinematic viscosity of air [m^2/s]
const T Re_z0 = (utau * z0) / nu;
z0h = z0 * exp(-0.1 * sqrt(Re_z0));
} else {
z0h = z0h_in;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a first comment, haven't gotten to look at the entire code yet.

I really don't like the hardcoded nu here, because it is completely independent of the material properties in the simulation's case file. I think one either picks up the mu / rho from the case file (The wall model should have access to these, I think), or we just make z0h mandatory, and people have to use whatever correlation they want, but outside Neko.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree, consider it some kind of placeholder for now: I want to include rho in the wall model computation and I will include mu as well (rho HAS to be there so it's a natural step to add mu as well). Thanks for spotting it!

@timofeymukha
Copy link
Collaborator

@loretet I'm going to order a copilot review now. It will probably spit out a lot of comments, and for now you can just focus on those related to the documentation. Like, it will hopefully pick up that the copyright year is wrong in some files, and similar stuff. For "tougher" comments, read them if you want, or you can wait for me to triage them first.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new Monin–Obukhov Similarity Theory (MOST) wall model to Neko, including both CPU and GPU (CUDA/HIP) implementations, and wires it into the wall-model factory and build system.

Changes:

  • Add most wall model type with JSON-driven configuration and compute path selecting CPU vs device backend.
  • Add CPU kernel and CUDA/HIP device kernels + wrappers for MOST.
  • Register most in the wall model factory and include new sources/headers in the build/dependency lists.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 24 comments.

Show a summary per file
File Description
src/wall_models/wall_model_fctry.f90 Registers most type in the wall-model factory.
src/wall_models/wall_model.f90 Indentation-only change in wall_model_find_points.
src/wall_models/most.f90 New MOST wall model type, JSON parsing, compute dispatch, lifecycle methods.
src/wall_models/bcknd/cpu/most_cpu.f90 New CPU implementation of MOST wall stress computation.
src/wall_models/bcknd/device/most_device.F90 Fortran-side device wrapper and C-bind interfaces for MOST.
src/wall_models/bcknd/device/hip/most_kernel.h New HIP kernel header implementing MOST logic.
src/wall_models/bcknd/device/hip/most.hip HIP launcher that selects kernel variant based on bc_type.
src/wall_models/bcknd/device/cuda/most_kernel.h New CUDA kernel header implementing MOST logic.
src/wall_models/bcknd/device/cuda/most.cu CUDA launcher that selects kernel variant based on bc_type.
src/wall_models/bcknd/device/hip/rough_log_law_kernel.h Adds licence header.
src/wall_models/bcknd/device/cuda/rough_log_law_kernel.h Adds licence header; adjusts a comment (currently incorrect).
src/Makefile.am Adds MOST sources to CPU + HIP/CUDA build lists and EXTRA_DIST.
src/.depends Updates Fortran dependency list for new MOST modules.
src/.depends_device Updates device dependency list for new MOST CUDA/HIP objects.

type(c_ptr), value :: u_d, v_d, w_d, temp_d
type(c_ptr), value :: ind_r_d, ind_s_d, ind_t_d, ind_e_d
type(c_ptr), value :: n_x_d, n_y_d, n_z_d, h_d
type(c_ptr) :: bc_type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also below. Let's avoid all of this by not passing a string to device kernels.

POSSIBILITY OF SUCH DAMAGE.
*/

#include <hip/hip_runtime.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but will be resolved if we don't use strings for bc_type

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

don't merge Don't merge yet! enhancement New feature or request

Projects

Status: 📋 Todo

Development

Successfully merging this pull request may close these issues.