Symmetry-based acceleration for HF and GW kernels and codebase cleanup #12
Replace full-BZ loops with k-star/q-star iteration in accumulate_gw_selfenergy_on_device. Add cu_symmetry integration for AO-basis k-point rotation (transform_k_ao_device) and auxiliary-basis q-point rotation (U_q, 3-GEMM) in both tau contractions. Fix heap-overflow memsets and a double-conjugation bug, and remove the dead need_minus_k/k1 bool parameters from the qkpt setup functions. X2C r0 is made a no-op pending full AO rotation support.
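As a rough illustration of what replacing a full-BZ loop with star iteration means, here is a minimal host-side sketch. It is not the code from this PR: the 1D mesh, the k → -k symmetry, the `StarMember` struct, and the functions `build_stars`, `g`, `full_bz_sum`, and `star_sum` are all hypothetical names chosen for the example. The quantity is evaluated once per irreducible point and expanded to the other star member by complex conjugation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical star entry: a full-BZ index plus a flag saying whether the
// value there is the complex conjugate of the value at the irreducible point.
struct StarMember {
  int k_full;
  bool conjugate;
};

// Toy 1D mesh of nk points with the symmetry k -> -k (mod nk).
// Each inner vector is the star of one irreducible point.
std::vector<std::vector<StarMember>> build_stars(int nk) {
  assert(nk > 0);
  std::vector<std::vector<StarMember>> stars;
  for (int k = 0; k < nk; ++k) {
    const int mk = (nk - k) % nk;
    if (k > mk) continue;  // already listed as the partner of a smaller index
    std::vector<StarMember> star{{k, false}};
    if (mk != k) star.push_back({mk, true});
    stars.push_back(star);
  }
  return stars;
}

// Toy quantity that satisfies g(-k) = conj(g(k)).
cplx g(int k, int nk) {
  const double pi = std::acos(-1.0);
  const double phase = 2.0 * pi * k / nk;
  return cplx(1.0 + std::cos(phase), std::sin(phase));
}

// Reference: evaluate g at every point of the full mesh.
cplx full_bz_sum(int nk) {
  cplx sum(0.0, 0.0);
  for (int k = 0; k < nk; ++k) sum += g(k, nk);
  return sum;
}

// Star iteration: evaluate g once per irreducible point, then expand.
cplx star_sum(int nk) {
  cplx sum(0.0, 0.0);
  for (const auto& star : build_stars(nk)) {
    const cplx value = g(star.front().k_full, nk);  // single evaluation
    for (const auto& member : star) {
      sum += member.conjugate ? std::conj(value) : value;
    }
  }
  return sum;
}
```

For `nk = 8` the sketch evaluates `g` at 5 irreducible points instead of 8 and `star_sum(8)` agrees with `full_bz_sum(8)` to rounding error. In the actual kernels the expansion step would be a matrix rotation rather than a scalar conjugation.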
…asks in minimal number of functions
Pull request overview
This PR introduces GPU-side exploitation of crystal symmetries for the GW self-energy (reducing full-BZ work to IBZ + star expansion) and performs a substantial CUDA codebase refactor by splitting the previous monolithic CUDA “routines” into focused HF/GW utility components plus a reusable cu_symmetry layer.
Changes:
- Add cu_symmetry + cu_symmetry_data to perform IBZ→full-BZ AO (k) and auxiliary (q/P0) unitary transforms on GPU, and update GW to loop over irreducible q-points (inq) rather than full q.
- Refactor GPU CUDA code: replace cu_routines.{h,cu} with cuhf_utils and cugw_utils, and split gw_qkpt into its own compilation unit.
- Add a CUDA-based symmetry transform roundtrip test and wire it into the test build.
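The roundtrip idea behind such a test can be sketched on the host with plain C++. This is only an illustration under stated assumptions: it assumes the transform has the form U A U† with a unitary U, and the helpers `matmul`, `dagger`, `rotate`, `rotate_back`, `max_abs_diff`, and `example_unitary` are hypothetical and do not come from the cu_symmetry API.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;
using Matrix = std::vector<cplx>;  // dense row-major n x n matrix

// Naive matrix product c = a * b for n x n matrices.
Matrix matmul(const Matrix& a, const Matrix& b, std::size_t n) {
  Matrix c(n * n, cplx(0.0, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < n; ++k)
      for (std::size_t j = 0; j < n; ++j)
        c[i * n + j] += a[i * n + k] * b[k * n + j];
  return c;
}

// Conjugate transpose.
Matrix dagger(const Matrix& a, std::size_t n) {
  Matrix d(n * n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      d[j * n + i] = std::conj(a[i * n + j]);
  return d;
}

// Forward transform: U A U^dagger (assumed form, see lead-in).
Matrix rotate(const Matrix& u, const Matrix& a, std::size_t n) {
  return matmul(matmul(u, a, n), dagger(u, n), n);
}

// Inverse transform: U^dagger A U, which undoes rotate() when U is unitary.
Matrix rotate_back(const Matrix& u, const Matrix& a, std::size_t n) {
  return matmul(matmul(dagger(u, n), a, n), u, n);
}

// Largest element-wise absolute difference between two matrices.
double max_abs_diff(const Matrix& a, const Matrix& b) {
  double diff = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    diff = std::max(diff, std::abs(a[i] - b[i]));
  return diff;
}

// Hypothetical 2x2 unitary: a real rotation by theta times a global phase.
Matrix example_unitary(double theta, double phi) {
  const cplx phase = std::polar(1.0, phi);
  const double c = std::cos(theta);
  const double s = std::sin(theta);
  return {phase * c, -phase * s, phase * s, phase * c};
}
```

A roundtrip check then rotates a test matrix with `rotate`, applies `rotate_back`, and compares against the original with `max_abs_diff`, expecting agreement to rounding error. A GPU test would compare device results against a host reference in the same way.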
Reviewed changes
Copilot reviewed 25 out of 38 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| test/cu_symmetry_test.h | Declares a GPU symmetry transform roundtrip test helper. |
| test/cu_symmetry_test.cu | Implements a GPU vs CPU reference symmetry transform validation routine. |
| test/cu_solver_test.cpp | Updates HDF5 key path for ink and adds a new symmetry transform test section. |
| test/CMakeLists.txt | Enables CUDA language in tests and compiles the new .cu test source. |
| src/hf_gpu_kernel.cpp | Switches HF exchange to cuhf_utils and updates direct term to use k_symmetry().value_AO/full-BZ looping. |
| src/gw_gpu_kernel.cpp | Switches to cugw_utils, builds cu_symmetry_data, moves GW outer distribution from ink to inq, and updates memory accounting. |
| src/green/gpu/hf_gpu_kernel.h | Updates brillouin_zone_utils alias usage. |
| src/green/gpu/gw_gpu_kernel.h | Updates brillouin_zone_utils alias usage and adjusts X2C copy helper signatures. |
| src/green/gpu/gpu_kernel.h | Adds q-mesh sizes (_nq, _inq) and switches k-pair counting to k_symmetry(). |
| src/green/gpu/gpu_factory.h | Updates brillouin_zone_utils alias usage. |
| src/green/gpu/df_integral_t.h | Switches k-pair symmetry bookkeeping from symmetry() to k_symmetry(). |
| src/green/gpu/cuhf_utils.h | New HF CUDA utility class header (replacement for HF parts of cu_routines). |
| src/green/gpu/cugw_utils.h | New GW CUDA utility class header integrating cu_symmetry and new callbacks. |
| src/green/gpu/cugw_qpt.h | Extends q-worker interface (constructor signature change + debug dump API). |
| src/green/gpu/cugw_qkpt.h | New per-(q,k) worker header separated from cugw_qpt.h. |
| src/green/gpu/cu_symmetry.h | New symmetry data container and GPU symmetry transform interface. |
| src/green/gpu/cu_routines.h | Removes the previous monolithic CUDA routines header. |
| src/green/gpu/cu_hf_utils.h | Leaves an empty placeholder file indicating a rename. |
| src/cuhf_utils.cu | New implementation of HF CUDA utilities (exchange accumulation). |
| src/cugw_utils.cu | New implementation of GW CUDA utilities (IBZ upload path + symmetry transforms + q-IBZ loop). |
| src/cugw_qpt.cu | Updates gw_qpt constructor signature and adds the P0 diagonal debug dump implementation. |
| src/cugw_qkpt.cu | New implementation of gw_qkpt worker split from cugw_qpt.cu. |
| src/cu_symmetry.cu | New implementation of GPU symmetry transforms and device metadata uploads. |
| src/cu_routines.cu | Removes the previous monolithic CUDA routines implementation. |
| src/CMakeLists.txt | Updates CUDA source list and links GREEN::SYMMETRY for the refactor. |
egull left a comment
I have some minor comments which we can skip for now. There's way too much work in this to really analyze it all without testing how it works in practice. So I propose to merge and then clean up if we see that something doesn't work.
I think none of the math operations on the GPU are performance critical. If they are, we can replace the hand-made multiplications with cuBLAS calls.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
| Coverage Diff | main | #12 | +/- |
|---|---|---|---|
| Coverage | 95.19% | 95.32% | +0.13% |
| Files | 12 | 13 | +1 |
| Lines | 832 | 877 | +45 |
| Hits | 792 | 836 | +44 |
| Misses | 40 | 41 | +1 |
Merging. With the tests passing and cleaner code, I feel we are ready for a 1.0.0a release of GREEN.
Implements crystal symmetry exploitation in the GPU GW self-energy kernel, reducing the k/q loop workload from full-BZ to IBZ, and refactors the GPU source tree for maintainability.
Key changes: