Symmetry-based acceleration for HF and GW kernels and codebase cleanup #12

Merged: gauravharsha merged 5 commits into main from symmetry on May 9, 2026

@gauravharsha
Contributor

Implements crystal symmetry exploitation in the GPU GW self-energy kernel, reducing the k/q loop workload from full-BZ to IBZ, and refactors the GPU source tree for maintainability.

Key changes:

  • cu_symmetry: new class for device-side k-AO and q-P0 unitary transforms; IBZ→full-BZ rotation of G and P0 on GPU
  • Low-memory scalar path: race-free IBZ G upload via dedicated stream + pinned staging buffer
  • gw_qpt / gw_qkpt separated into own files; cu_routines replaced by cuhf_utils + cugw_utils
  • Memory accounting in GW_check_devices_free_space updated for new allocations
  • set_up_qkpt_first/second renamed to upload_p0_inputs/upload_sigma_inputs/upload_p0_coulomb

Replace full-BZ loops with k-star/q-star iteration in accumulate_gw_selfenergy_on_device. Add cu_symmetry integration for AO-basis k-point rotation (transform_k_ao_device) and auxiliary-basis q-point rotation (U_q, 3-GEMM) in both tau contractions. Fix heap-overflow memsets and a double-conjugation bug, and remove the dead need_minus_k/k1 bool parameters from the qkpt setup functions. X2C r0 is made a no-op pending full AO rotation support.
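
The k-star/q-star iteration mentioned above can be illustrated with a toy model. The following is a minimal host-side C++ sketch, not the PR's CUDA code: it assumes a hypothetical 1D ring of n k-points whose only symmetry is inversion (k → -k mod n), and the names `Star`, `build_stars`, `star_sum`, `full_bz_sum`, and `sample_weight` are invented for illustration. It only shows that an outer loop over irreducible points with an inner loop over each star visits every full-BZ point exactly once.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1D ring of n k-points whose only symmetry is inversion, k -> -k (mod n).
struct Star {
  int representative;        // irreducible (IBZ) point
  std::vector<int> members;  // full-BZ points generated from it, representative first
};

// Group the n mesh points into stars; each full-BZ point lands in exactly one star.
std::vector<Star> build_stars(int n) {
  assert(n > 0);
  std::vector<Star> stars;
  std::vector<bool> seen(n, false);
  for (int k = 0; k < n; ++k) {
    if (seen[k]) continue;
    Star s{k, {k}};
    seen[k] = true;
    const int minus_k = (n - k) % n;  // image of k under inversion
    if (!seen[minus_k]) {
      s.members.push_back(minus_k);
      seen[minus_k] = true;
    }
    stars.push_back(s);
  }
  return stars;
}

// Reference: accumulate f over every point of the full mesh.
long full_bz_sum(int n, long (*f)(int)) {
  long total = 0;
  for (int k = 0; k < n; ++k) total += f(k);
  return total;
}

// Outer loop over irreducible points, inner loop over the star of each one.
long star_sum(const std::vector<Star>& stars, long (*f)(int)) {
  long total = 0;
  for (const Star& s : stars)
    for (int k : s.members) total += f(k);
  return total;
}

// Arbitrary integer-valued test function; it does not need to be symmetric.
long sample_weight(int k) { return 3L * k * k + k + 1; }
```

For a 6-point ring this produces four stars ({0}, {1, 5}, {2, 4}, {3}), so the outer loop runs over four points instead of six. In the real kernel the per-member work would additionally involve rotating quantities from the representative to each star member, which this toy model leaves out.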

Copilot AI left a comment

Pull request overview

This PR introduces GPU-side exploitation of crystal symmetries for the GW self-energy (reducing full-BZ work to IBZ + star expansion) and performs a substantial CUDA codebase refactor by splitting the previous monolithic CUDA “routines” into focused HF/GW utility components plus a reusable cu_symmetry layer.

Changes:

  • Add cu_symmetry + cu_symmetry_data to perform IBZ→full-BZ AO (k) and auxiliary (q/P0) unitary transforms on GPU, and update GW to loop over irreducible q-points (inq) rather than full q.
  • Refactor GPU CUDA code: replace cu_routines.{h,cu} with cuhf_utils and cugw_utils, and split gw_qkpt into its own compilation unit.
  • Add a CUDA-based symmetry transform roundtrip test and wire it into the test build.
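
The idea behind a symmetry transform roundtrip check can be sketched on the host as follows. This is an assumption about the general shape of such a test, not a copy of test/cu_symmetry_test.cu, whose contents are not shown here: a matrix is rotated with a unitary U (A → U A U†), rotated back (U† A' U), and compared with the original. All names (`Matrix`, `multiply`, `dagger`, `rotate`, `rotate_back`, `max_abs_diff`, `sample_unitary`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;
using Matrix = std::vector<cplx>;  // dense n x n matrix, row-major

// C = A * B for n x n matrices.
Matrix multiply(const Matrix& a, const Matrix& b, std::size_t n) {
  assert(a.size() == n * n && b.size() == n * n);
  Matrix c(n * n, cplx(0.0, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < n; ++k)
      for (std::size_t j = 0; j < n; ++j)
        c[i * n + j] += a[i * n + k] * b[k * n + j];
  return c;
}

// Conjugate transpose of an n x n matrix.
Matrix dagger(const Matrix& a, std::size_t n) {
  Matrix d(n * n);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      d[j * n + i] = std::conj(a[i * n + j]);
  return d;
}

// Forward transform: U A U^dagger.
Matrix rotate(const Matrix& u, const Matrix& a, std::size_t n) {
  return multiply(multiply(u, a, n), dagger(u, n), n);
}

// Inverse transform: U^dagger A U.
Matrix rotate_back(const Matrix& u, const Matrix& a, std::size_t n) {
  return multiply(multiply(dagger(u, n), a, n), u, n);
}

// Largest element-wise deviation between two matrices of equal size.
double max_abs_diff(const Matrix& a, const Matrix& b) {
  assert(a.size() == b.size());
  double worst = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    worst = std::max(worst, std::abs(a[i] - b[i]));
  return worst;
}

// Hypothetical 2 x 2 unitary: a real rotation by theta times diag(e^{i*phase}, 1).
Matrix sample_unitary(double theta, double phase) {
  const cplx p = std::polar(1.0, phase);
  const double c = std::cos(theta);
  const double s = std::sin(theta);
  return {p * c, cplx(-s, 0.0), p * s, cplx(c, 0.0)};
}
```

A check of this kind would call `rotate`, then `rotate_back`, and require `max_abs_diff` against the original to be at the level of floating-point round-off. A GPU-versus-CPU validation would instead compare the device result of the forward transform against a host reference such as `rotate`.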

Reviewed changes

Copilot reviewed 25 out of 38 changed files in this pull request and generated 9 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/cu_symmetry_test.h | Declares a GPU symmetry transform roundtrip test helper. |
| test/cu_symmetry_test.cu | Implements a GPU vs CPU reference symmetry transform validation routine. |
| test/cu_solver_test.cpp | Updates HDF5 key path for ink and adds a new symmetry transform test section. |
| test/CMakeLists.txt | Enables CUDA language in tests and compiles the new .cu test source. |
| src/hf_gpu_kernel.cpp | Switches HF exchange to cuhf_utils and updates direct term to use k_symmetry().value_AO/full-BZ looping. |
| src/gw_gpu_kernel.cpp | Switches to cugw_utils, builds cu_symmetry_data, moves GW outer distribution from ink to inq, and updates memory accounting. |
| src/green/gpu/hf_gpu_kernel.h | Updates brillouin_zone_utils alias usage. |
| src/green/gpu/gw_gpu_kernel.h | Updates brillouin_zone_utils alias usage and adjusts X2C copy helper signatures. |
| src/green/gpu/gpu_kernel.h | Adds q-mesh sizes (_nq, _inq) and switches k-pair counting to k_symmetry(). |
| src/green/gpu/gpu_factory.h | Updates brillouin_zone_utils alias usage. |
| src/green/gpu/df_integral_t.h | Switches k-pair symmetry bookkeeping from symmetry() to k_symmetry(). |
| src/green/gpu/cuhf_utils.h | New HF CUDA utility class header (replacement for HF parts of cu_routines). |
| src/green/gpu/cugw_utils.h | New GW CUDA utility class header integrating cu_symmetry and new callbacks. |
| src/green/gpu/cugw_qpt.h | Extends q-worker interface (constructor signature change + debug dump API). |
| src/green/gpu/cugw_qkpt.h | New per-(q,k) worker header separated from cugw_qpt.h. |
| src/green/gpu/cu_symmetry.h | New symmetry data container and GPU symmetry transform interface. |
| src/green/gpu/cu_routines.h | Removes the previous monolithic CUDA routines header. |
| src/green/gpu/cu_hf_utils.h | Leaves an empty placeholder file indicating a rename. |
| src/cuhf_utils.cu | New implementation of HF CUDA utilities (exchange accumulation). |
| src/cugw_utils.cu | New implementation of GW CUDA utilities (IBZ upload path + symmetry transforms + q-IBZ loop). |
| src/cugw_qpt.cu | Updates gw_qpt constructor signature and adds the P0 diagonal debug dump implementation. |
| src/cugw_qkpt.cu | New implementation of gw_qkpt worker split from cugw_qpt.cu. |
| src/cu_symmetry.cu | New implementation of GPU symmetry transforms and device metadata uploads. |
| src/cu_routines.cu | Removes the previous monolithic CUDA routines implementation. |
| src/CMakeLists.txt | Updates CUDA source list and links GREEN::SYMMETRY for the refactor. |

Comment thread src/cu_symmetry.cu
Comment thread src/cu_symmetry.cu Outdated
Comment thread src/green/gpu/cu_symmetry.h Outdated
Comment thread src/gw_gpu_kernel.cpp
Comment thread src/green/gpu/cugw_qpt.h Outdated
Comment thread test/cu_solver_test.cpp Outdated
Comment thread src/hf_gpu_kernel.cpp
Comment thread src/cugw_qkpt.cu
Comment thread src/cugw_utils.cu Outdated

@egull egull left a comment

I have a few minor comments, which we can skip for now. There's way too much work in this to really analyze it all without testing how it works in practice. So I propose to merge and then clean up if we see that something doesn't work.
I think none of the math operations on the GPU are performance critical. If they are, we can replace the hand-made multiplications with cuBLAS calls.

Comment thread src/green/gpu/cu_hf_utils.h Outdated
Comment thread src/green/gpu/cu_symmetry.h Outdated
Comment thread src/green/gpu/cu_symmetry.h Outdated
@codecov-commenter

Codecov Report

❌ Patch coverage is 96.96970% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.32%. Comparing base (fae769b) to head (378e813).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/gw_gpu_kernel.cpp | 95.76% | 5 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
+ Coverage   95.19%   95.32%   +0.13%     
==========================================
  Files          12       13       +1     
  Lines         832      877      +45     
==========================================
+ Hits          792      836      +44     
- Misses         40       41       +1     
| Flag | Coverage | Δ |
| --- | --- | --- |
| unittests | 95.32% <96.96%> (+0.13%) | ⬆️ |
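
The project coverage figures in the diff above follow from the hit and line counts. As a small worked check in C++ (the helper names `coverage_percent` and `round2` are hypothetical and not part of Codecov or the repository):

```cpp
#include <cassert>
#include <cmath>

// Coverage as the percentage of lines that were hit.
double coverage_percent(int hits, int lines) {
  assert(lines > 0 && hits >= 0 && hits <= lines);
  return 100.0 * hits / lines;
}

// Round to two decimals, the precision shown in the report.
double round2(double x) { return std::round(x * 100.0) / 100.0; }
```

With the numbers from the diff, `round2(coverage_percent(792, 832))` gives 95.19 for the base and `round2(coverage_percent(836, 877))` gives 95.32 for the head, a difference of 0.13 percentage points.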

@gauravharsha
Contributor Author

Merging. With the tests passing and cleaner code, I feel we are ready for a 1.0.0a release of GREEN.

@gauravharsha gauravharsha merged commit 04b75bc into main May 9, 2026
1 check passed

4 participants