fix #396: reduce spatial_alignment RAM usage with sparse neigh_graph and fix GPU validation #438
Draft
cursor[bot] wants to merge 1 commit into main
…and fix GPU validation
- Convert dense N×N neigh_graph tensor to sparse COO format in dataset.py, reducing memory from O(N²) to O(k*N) where k is the number of neighbors
- Update readout() in dgi.py to handle sparse neigh_mask via torch.sparse.mm
- Fix GPU device validation in model.py that incorrectly rejected odd GPU IDs (e.g., gpu=1) due to erroneous modulo-2 check
Co-authored-by: wanruiwen-genomics-cn <wanruiwen-genomics-cn@users.noreply.github.com>
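The GPU validation item can be illustrated with a small sketch. It is not the code in model.py: `old_gpu_check` reconstructs only the modulo-2 rule quoted in this PR, `is_valid_gpu_id` is a hypothetical helper, and `device_count` is passed in as a plain integer standing in for torch.cuda.device_count() so the example runs without a GPU.

```python
def old_gpu_check(gpu) -> bool:
    """Reconstruction of the rule described in this PR: only IDs with
    float(gpu) % 2 == 0 pass, so odd IDs are rejected."""
    return float(gpu) % 2 == 0


def is_valid_gpu_id(gpu, device_count: int) -> bool:
    """Hypothetical replacement: accept any whole-number ID in [0, device_count)."""
    try:
        as_float = float(gpu)
    except (TypeError, ValueError):
        return False  # not a number at all, e.g. "abc" or None
    # is_integer() is False for fractions, NaN and infinity
    return as_float.is_integer() and 0 <= int(as_float) < device_count


# On a machine assumed to expose 4 GPUs:
DEVICE_COUNT = 4
accepted_old = [g for g in range(DEVICE_COUNT) if old_gpu_check(g)]
accepted_new = [g for g in range(DEVICE_COUNT) if is_valid_gpu_id(g, DEVICE_COUNT)]
print(accepted_old)  # even IDs only
print(accepted_new)  # every ID below DEVICE_COUNT
```

Passing the device count as an argument keeps the check testable on CPU-only machines; in real code the caller would supply the value reported by the CUDA runtime.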
Summary
The spatial_alignment algorithm allocates a dense (num_nodes, num_nodes) tensor for the neighbor adjacency graph, which for large Stereo-seq samples at bin50 (~200K bins) consumes ~150 GB of RAM per sample, making the function infeasible even on 500+ GB workstations. Additionally, a bug in GPU device validation rejects odd-numbered GPU IDs (1, 3, etc.) due to an erroneous float(gpu) % 2 == 0 check.
Changes
- stereo/algorithm/spatial_alignment/utils/dataset.py: Replace the dense torch.zeros((N, N)) neighbor graph with a sparse COO tensor, reducing memory from O(N²) to O(k×N), where k is the number of KNN neighbors (typically 15).
- stereo/algorithm/spatial_alignment/module/dgi.py: Update readout() to use torch.sparse.mm() and torch.sparse.sum() when the neighbor mask is sparse, with a backward-compatible fallback for dense tensors.
- stereo/algorithm/spatial_alignment/spatialign/model.py: Fix GPU device validation to accept any valid non-negative GPU ID within torch.cuda.device_count(), instead of the broken modulo-2 check. Also fix a missing-parentheses bug in the get_format_time call.
Verification
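One way to sanity-check the dataset.py and dgi.py changes is to compare the sparse and dense paths on a small synthetic graph. The sketch below is illustrative and not taken from the repository: `build_sparse_neigh_graph` is a hypothetical helper, and `readout` here assumes the aggregation is a masked mean over each node's neighbors, which may differ from the real readout() in normalization or post-processing.

```python
import torch


def build_sparse_neigh_graph(neighbors: torch.Tensor) -> torch.Tensor:
    """Build an (N, N) sparse COO adjacency from an (N, k) table of neighbor indices."""
    num_nodes, k = neighbors.shape
    rows = torch.arange(num_nodes).repeat_interleave(k)  # each node id repeated k times
    cols = neighbors.reshape(-1)                          # the matching neighbor ids
    indices = torch.stack([rows, cols])                   # shape (2, N * k)
    values = torch.ones(indices.shape[1])
    return torch.sparse_coo_tensor(indices, values, (num_nodes, num_nodes)).coalesce()


def readout(emb: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Assumed masked-mean readout: average the embeddings of each node's neighbors."""
    if mask.is_sparse:
        vsum = torch.sparse.mm(mask, emb)  # sparse (N, N) times dense (N, d)
        row_sum = torch.sparse.sum(mask, dim=1).to_dense().unsqueeze(1)
    else:
        vsum = torch.mm(mask, emb)         # dense fallback
        row_sum = mask.sum(dim=1, keepdim=True)
    return vsum / row_sum.clamp(min=1.0)   # guard against rows with no neighbors


torch.manual_seed(0)
N, K, D = 50, 5, 8
# Ring-shaped toy graph: node i is linked to the next K nodes, so there are no duplicates.
neighbors = (torch.arange(N).unsqueeze(1) + torch.arange(1, K + 1)) % N
sparse_graph = build_sparse_neigh_graph(neighbors)
dense_graph = sparse_graph.to_dense()

emb = torch.randn(N, D)
out_sparse = readout(emb, sparse_graph)
out_dense = readout(emb, dense_graph)
print(torch.allclose(out_sparse, out_dense, atol=1e-6))
```

The sparse tensor stores only N × K entries, while the dense version materialized here for comparison would be the N × N allocation the PR removes.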
Classification
Closes #396
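The RAM figures quoted in the summary can be checked with quick arithmetic. This is a rough estimate under stated assumptions, not a measurement: it assumes float32 values (4 bytes) for the dense tensor, and int64 indices (8 bytes each, two per entry) plus float32 values for the sparse COO tensor; the function names are hypothetical.

```python
def dense_bytes(num_nodes: int, value_bytes: int = 4) -> int:
    """Bytes for a dense (N, N) tensor, assuming float32 values by default."""
    return num_nodes * num_nodes * value_bytes


def sparse_coo_bytes(num_nodes: int, k: int,
                     value_bytes: int = 4, index_bytes: int = 8) -> int:
    """Bytes for a COO tensor with N * k entries: a (row, col) index pair plus one value each."""
    nnz = num_nodes * k
    return nnz * (2 * index_bytes + value_bytes)


N = 200_000  # ~200K bins at bin50, as quoted in the summary
K = 15       # typical number of KNN neighbors, as quoted in the change list

dense = dense_bytes(N)
sparse = sparse_coo_bytes(N, K)
print(f"dense : {dense / 2**30:.1f} GiB")
print(f"sparse: {sparse / 2**20:.1f} MiB")
print(f"ratio : {dense / sparse:.0f}x")
```

Under these assumptions the dense graph needs 160,000,000,000 bytes (about 149 GiB), in line with the ~150 GB figure, while the sparse graph needs 60,000,000 bytes.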