fix #396: reduce spatial_alignment RAM usage with sparse neigh_graph and fix GPU validation#438

Draft
cursor[bot] wants to merge 1 commit into main from cursorautomated-issue-handler-7205

cursor[bot] commented Mar 30, 2026

Summary

The spatial_alignment algorithm allocates a dense (num_nodes, num_nodes) tensor for the neighbor adjacency graph. For large Stereo-seq samples at bin50 (~200K bins), this tensor consumes ~150 GB of RAM per sample, which makes the function infeasible even on workstations with 500+ GB of RAM. Additionally, a bug in GPU device validation rejects odd-numbered GPU IDs (1, 3, etc.) because of an erroneous float(gpu) % 2 == 0 check.
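The ~150 GB figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes float32 values and int64 COO indices (neither is stated in this PR) and exactly k non-zeros per row; the function names are illustrative only.

```python
def dense_graph_bytes(num_nodes: int, bytes_per_element: int = 4) -> int:
    """Bytes needed for a dense (num_nodes, num_nodes) tensor (float32 assumed)."""
    return num_nodes * num_nodes * bytes_per_element


def sparse_coo_bytes(num_nodes: int, k: int,
                     index_bytes: int = 8, value_bytes: int = 4) -> int:
    """Approximate bytes for a COO tensor with k non-zeros per row.

    Each non-zero stores two indices (row, column; int64 assumed) plus one value.
    """
    nnz = num_nodes * k
    return nnz * (2 * index_bytes + value_bytes)


n, k = 200_000, 15
dense = dense_graph_bytes(n)
sparse = sparse_coo_bytes(n, k)
print(f"dense : {dense / 2**30:.0f} GiB")   # about 149 GiB
print(f"sparse: {sparse / 1e6:.0f} MB")     # about 60 MB
```

Under these assumptions the sparse representation is smaller by a factor of a few thousand, which is where the O(N²) versus O(k×N) difference shows up in practice.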

Changes

  • stereo/algorithm/spatial_alignment/utils/dataset.py: Replace the dense torch.zeros((N, N)) neighbor graph with a sparse COO tensor, reducing memory from O(N²) to O(k×N), where k is the number of KNN neighbors (typically 15).
  • stereo/algorithm/spatial_alignment/module/dgi.py: Update readout() to use torch.sparse.mm() and torch.sparse.sum() when the neighbor mask is sparse, with a backward-compatible fallback for dense tensors.
  • stereo/algorithm/spatial_alignment/spatialign/model.py: Fix GPU device validation to accept any non-negative GPU ID below torch.cuda.device_count(), instead of the broken modulo-2 check. Also fix missing parentheses in the get_format_time call.
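The first two changes can be sketched as follows. This is a minimal illustration, not the code in this PR: it assumes the KNN result is available as an (N, k) index tensor (here called knn_idx, a hypothetical name) and that readout() reduces to a neighborhood mean; the real readout() in dgi.py may apply further steps.

```python
import torch


def build_sparse_neigh_graph(knn_idx: torch.Tensor) -> torch.Tensor:
    """Build an (N, N) sparse COO adjacency from an (N, k) tensor of neighbor indices."""
    num_nodes, k = knn_idx.shape
    rows = torch.arange(num_nodes).repeat_interleave(k)  # each row index repeated k times
    cols = knn_idx.reshape(-1).long()
    indices = torch.stack([rows, cols])                  # shape (2, N * k)
    values = torch.ones(indices.shape[1], dtype=torch.float32)
    return torch.sparse_coo_tensor(indices, values, (num_nodes, num_nodes)).coalesce()


def readout(emb: torch.Tensor, neigh_mask: torch.Tensor) -> torch.Tensor:
    """Average each node's neighbor embeddings; accepts a sparse or a dense mask."""
    if neigh_mask.is_sparse:
        summed = torch.sparse.mm(neigh_mask, emb)                # dense (N, d) result
        counts = torch.sparse.sum(neigh_mask, dim=1).to_dense()  # neighbors per node, (N,)
    else:
        summed = neigh_mask @ emb
        counts = neigh_mask.sum(dim=1)
    return summed / counts.clamp(min=1).unsqueeze(1)


# Toy example: 4 nodes, 2 neighbors each, 2-dimensional embeddings.
knn_idx = torch.tensor([[1, 2], [0, 2], [0, 1], [2, 0]])
emb = torch.arange(8, dtype=torch.float32).reshape(4, 2)
graph = build_sparse_neigh_graph(knn_idx)
sparse_out = readout(emb, graph)
dense_out = readout(emb, graph.to_dense())
print(torch.allclose(sparse_out, dense_out))
```

Keeping the dense branch means callers that still pass a dense mask get the same result, which is the backward-compatible fallback the change list describes.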

Verification

  • AST syntax check passed for all 3 files
  • Import check passed (module-level, dependencies not installed in CI)
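An AST syntax check of this kind can be done with the standard library alone. The helper below is a hypothetical sketch, not the script used for this PR; it parses source text without executing it, so it catches syntax errors but not runtime errors.

```python
import ast


def has_valid_syntax(source: str) -> bool:
    """Return True if the source text parses as Python, without running it."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(has_valid_syntax("x = 1"))    # True
print(has_valid_syntax("def f(:"))  # False
```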

Classification

  • Type: bug
  • Confidence: high
  • Severity: high

Closes #396

…and fix GPU validation

- Convert dense N×N neigh_graph tensor to sparse COO format in dataset.py,
  reducing memory from O(N²) to O(k*N) where k is the number of neighbors
- Update readout() in dgi.py to handle sparse neigh_mask via torch.sparse.mm
- Fix GPU device validation in model.py that incorrectly rejected odd GPU IDs
  (e.g., gpu=1) due to erroneous modulo-2 check
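
To illustrate why the modulo-2 expression rejects odd IDs, and what a range check could look like instead, here is a small sketch. The function names are hypothetical, and device_count is passed in as a plain integer so the example runs without a GPU; in model.py it would come from torch.cuda.device_count().

```python
def old_check(gpu) -> bool:
    """The expression quoted in the PR summary: true only for even IDs."""
    return float(gpu) % 2 == 0


def is_valid_gpu_id(gpu, device_count: int) -> bool:
    """Accept any whole-number ID in the range [0, device_count)."""
    try:
        as_float = float(gpu)
    except (TypeError, ValueError):
        return False
    return as_float.is_integer() and 0 <= as_float < device_count


ids = [0, 1, 2, 3]
print([old_check(i) for i in ids])           # [True, False, True, False]
print([is_valid_gpu_id(i, 4) for i in ids])  # [True, True, True, True]
print(is_valid_gpu_id(4, 4), is_valid_gpu_id(-1, 4))  # False False
```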

Co-authored-by: wanruiwen-genomics-cn <wanruiwen-genomics-cn@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

High RAM usage and lack of multi-GPU support in spatial_alignment
