Use device_unmap in destructors#2391
Use device_unmap in destructors#2391timofeymukha wants to merge 10 commits intoExtremeFLOW:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Propagates use of device_unmap in destructors/free routines across multiple Fortran modules to unify host/device memory teardown and reduce manual device_free/device_deassociate handling.
Changes:
- Replaced many
device_free(and somedevice_deassociate) calls withdevice_unmapin destructors. - Moved unmapping logic to occur alongside host deallocation (
allocated(...)-guarded blocks). - Simplified module imports by dropping now-unused
c_associatedin several files.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/wall_models/wall_model.f90 | Unmaps device mappings for index arrays during base destructor cleanup. |
| src/time_schemes/time_scheme_controller.f90 | Uses device_unmap for mapped coefficient arrays in destructor. |
| src/time_schemes/runge_kutta_scheme.f90 | Unmaps RK coefficient arrays before deallocation. |
| src/source_terms/gradient_jump_penalty.f90 | Converts extensive device cleanup from device_free to device_unmap tied to host allocations. |
| src/simulation_components/probes.F90 | Switches probe output buffers to device_unmap per-field mapping during teardown. |
| src/sem/space.f90 | Unmaps many SEM space arrays before deallocation, removing separate device-free section. |
| src/sem/local_interpolation.f90 | Unmaps interpolation weights before deallocation. |
| src/sem/interpolation.f90 | Unmaps interpolation matrices before deallocation, removing standalone device-free block. |
| src/sem/dofmap.f90 | Unmaps coordinate arrays before deallocation, removing standalone device-free block. |
| src/sem/coef.f90 | Unmaps many coefficient/device-mapped arrays before deallocation, removing large standalone device-free block. |
| src/math/vector.f90 | Uses device_unmap to replace device_deassociate + device_free in vector destructor. |
| src/math/schwarz.f90 | Unmaps mapped work buffers; keeps device_free for device-only allocations. |
| src/math/matrix.f90 | Uses device_unmap to replace device_deassociate + device_free in matrix destructor. |
| src/math/fdm.f90 | Unmaps mapped arrays before deallocation, removing standalone device-free block. |
| src/global_interpolation/global_interpolation.f90 | Unmaps mapped owner arrays and replaces manual unmap/free sequence with device_unmap. |
| src/fluid/bcknd/advection/adv_oifs.f90 | Unmaps mapped arrays before deallocation, removing standalone device-free block. |
| src/fluid/bcknd/advection/adv_no_dealias.f90 | Unmaps mapped temp buffer before deallocation. |
| src/fluid/bcknd/advection/adv_dealias.f90 | Unmaps mapped scratch arrays before deallocation, removing standalone device-free block. |
| src/filter/elementwise_filter.f90 | Unmaps mapped filter matrices before deallocation, removing standalone device-free block. |
|
Wouldn't it be a bit more safe if we still do the test with |
From what I can see, all the updates here mirror the initializers. So I think we should not add the extra |
Fully agree that we do mirror init and in principle it will be fine. But without checking the content of the pointer, we don't know if something bad has happened during the run. E.g. some consumer has deallocated an array etc (such an error will be a pain to find (now we just silently ignores it, which is probably also not the best thing)) |
Which is why I argue that we move it outside the |
|
My master plan to settle everything in yesterday's small PR has clearly failed :-). It's true that if someone deallocates the host array randomly, the device pointer will leak. But passing an unallocated array to a dummy argument which is not itself allocatable is illegal, or? I guess we have to have an else if clause where we check the c_associate on x_d and free that if associated. I would not put that stuff in unmap, it should just reverse a valid map. So in the end we will have about the same verbosity level as for just adding a dissassociate directly and not using unmap, but unmap gives us extra validity checks. To be honest, in 90% of these types, one should just use vector_t and matrix_t instead of storing x and x_d. Then, it will just be a matter of calling vector_t%free, and we can be very verbose and careful in that destructor. |
True, my understanding of the standard is that passing an unallocated to a not (marked) allocatable dummy argument should be illegal (but a language lawyer might be needed decide on that)
Agree, check and then call unmap instead of a foolproof unmap, like a free in C, it would accept anything ;)
In an ideal world yes... but with the current state of compilers, maybe not. These structures prevents optimisation (sigh), unless do concurrent or we sprinkle pragmas everywhere (yeah I know it's shouldn't be needed but it is sadly) |
Propagates the use of device_unmap through the codebase.