Split GSVA ranks calculations, enabling distribution of computations #267

Merged: rcastelo merged 12 commits into devel from 266-split-gsva-ranks-calculations on May 25, 2026

@rcastelo

Owner

This PR splits the calculation of GSVA ranks into two steps: row normalization and column rank calculation. The functions gsvaRanks() and gsvaScores() are deprecated and replaced by gsvaRowNorm(), gsvaColRanks() and gsvaColScores(). Each new function takes two additional parameters, first and last, which restrict calculations to the range of rows (for gsvaRowNorm()) or columns (for gsvaColRanks() and gsvaColScores()) from first to last. By default, first=NA and last=NA, meaning that no restriction applies. Restricting calculations to a range of rows or columns should make it possible to distribute computations in an HPC environment across nodes that do not share main memory. This PR closes issue #266.
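
As a rough illustration of how such ranges might be prepared for distribution, the sketch below splits n rows or columns into contiguous (first, last) pairs, one per worker. It is written in Python purely as a language-agnostic illustration and does not call the GSVA API; the helper name chunk_ranges is hypothetical, and treating first and last as 1-based, inclusive bounds is an assumption based on R conventions.

```python
def chunk_ranges(n, n_chunks):
    """Split indices 1..n into at most n_chunks contiguous (first, last) pairs.

    Bounds are 1-based and inclusive (an assumption mirroring R indexing).
    """
    if n < 1 or n_chunks < 1:
        raise ValueError("n and n_chunks must be positive")
    n_chunks = min(n_chunks, n)          # never create empty chunks
    base, extra = divmod(n, n_chunks)    # the first `extra` chunks get one more index
    ranges = []
    first = 1
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        last = first + size - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


# Ten columns shared among three workers.
print(chunk_ranges(10, 3))
```

Each pair could then be handed to a separate job as its first and last arguments; how the per-chunk results are combined afterwards is not described in this PR text.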

Copilot AI review requested due to automatic review settings May 25, 2026 18:19
@rcastelo rcastelo linked an issue May 25, 2026 that may be closed by this pull request

Copilot AI left a comment

Contributor

Pull request overview

Splits the GSVA rank calculation into separate gsvaRowNorm(), gsvaColRanks(), and gsvaColScores() methods (deprecating gsvaRanks()/gsvaScores()), each accepting first/last arguments so the row-normalization and column-rank/score stages can be distributed across HPC nodes. Supporting changes refactor the wrapData() generic and data containers to carry GSVA metadata and added assays (e.g., gsvarownr, gsvaranks), expand .check_maxmem()/.check_ondisk()/.check_sparse_load_input_expr() to honor restricted ranges, update HDF5 serialization to round-trip the new metadata, and add a details() method while reworking show(). Documentation, vignettes and unit tests are adjusted accordingly.
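
The reason a row-wise stage can be chunked by rows and a column-wise stage by columns can be seen in a toy example. The Python sketch below is not the GSVA algorithm: row_standardize uses a plain z-score as a stand-in for row normalization, col_ranks uses simple ordinal ranks, and both function names are hypothetical. It only checks that processing chunks and concatenating the results matches a single full pass.

```python
from statistics import mean, pstdev


def row_standardize(matrix, first=None, last=None):
    """Z-score rows first..last (1-based, inclusive); stand-in for row normalization."""
    first = 1 if first is None else first
    last = len(matrix) if last is None else last
    out = []
    for row in matrix[first - 1:last]:
        m, s = mean(row), pstdev(row)
        out.append([(x - m) / s if s > 0 else 0.0 for x in row])
    return out


def col_ranks(matrix, first=None, last=None):
    """Ordinal ranks within columns first..last; returns one list of ranks per column."""
    first = 1 if first is None else first
    last = len(matrix[0]) if last is None else last
    ranked = []
    for j in range(first - 1, last):
        col = [row[j] for row in matrix]
        order = sorted(range(len(col)), key=lambda i: col[i])
        ranks = [0] * len(col)
        for r, i in enumerate(order, start=1):
            ranks[i] = r
        ranked.append(ranks)
    return ranked


expr = [
    [5.0, 1.0, 3.0, 2.0],
    [2.0, 4.0, 6.0, 8.0],
    [9.0, 7.0, 8.0, 1.0],
]

# Stage 1: each row depends only on itself, so rows can be split across workers.
full_norm = row_standardize(expr)
chunked_norm = row_standardize(expr, 1, 2) + row_standardize(expr, 3, 3)

# Stage 2: each column depends only on itself, but needs every normalized row.
full_ranks = col_ranks(full_norm)
chunked_ranks = col_ranks(full_norm, 1, 2) + col_ranks(full_norm, 3, 4)

print(chunked_norm == full_norm, chunked_ranks == full_ranks)
```

In this toy setup the column stage can only start once all row chunks are available, since ranking a column reads every row.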

Changes:

  • Introduce three-step API (gsvaRowNorm/gsvaColRanks/gsvaColScores) with first/last chunking and dropExistingAssays support; mark gsvaRanks/gsvaScores deprecated.
  • Refactor wrapData() methods, helper checks, and .pull_param()/.gsvaParam_as_list() to embed/retrieve parameter metadata across containers and to size memory based on restricted ranges.
  • Update vignettes, Rd files, tests, and HDF5 serialization helpers to use the new API and to add details() method.
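
To give a sense of why sizing memory on the restricted range matters, the following back-of-the-envelope Python sketch compares a full dense matrix with a restricted column range. It assumes dense double-precision storage at 8 bytes per value; the function names and the example dimensions are hypothetical, and it does not reproduce the logic of .check_maxmem().

```python
def dense_bytes(n_rows, n_cols, bytes_per_value=8):
    """Bytes needed for a dense matrix of doubles (8 bytes each by default)."""
    return n_rows * n_cols * bytes_per_value


def restricted_col_bytes(n_rows, n_cols, first=None, last=None):
    """Bytes needed when only columns first..last (1-based, inclusive) are held."""
    first = 1 if first is None else first
    last = n_cols if last is None else last
    if not (1 <= first <= last <= n_cols):
        raise ValueError("require 1 <= first <= last <= n_cols")
    return dense_bytes(n_rows, last - first + 1)


# Hypothetical dataset: 20,000 genes by 100,000 cells.
full = restricted_col_bytes(20_000, 100_000)
chunk = restricted_col_bytes(20_000, 100_000, first=1, last=10_000)
print(full, chunk)
```

Under these assumptions, a node holding one tenth of the columns needs one tenth of the memory for that stage.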

Reviewed changes

Copilot reviewed 29 out of 38 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| R/gsva.R | Adds new step methods, helpers .pull_param/.gsvaParam_as_list, details(), and chunked first/last plumbing. |
| R/utils.R | Generalizes wrapData methods, adds range/first-last and whdim plumbing in .check_* helpers. |
| R/gsvaRanks_serialization.R | Reworks save/load to operate on GsvaExprData with embedded GSVA metadata. |
| R/GSVA-pkg-deprecated.R | Re-implements deprecated gsvaRanks/gsvaScores via new helpers. |
| R/ssgsea.R, R/plage.R, R/zscore.R | Updates gsva() methods to call the new .check_* and wrapData() signatures; adds details for ssgsea. |
| R/GsvaMethodParam.R, R/AllGenerics.R, R/AllClasses.R | Adds new generics/exports, simplified show(), details() method, updates supported classes. |
| R/GSVA-package.R, NAMESPACE | Adjusts imports/exports for new methods and dropped re-exports. |
| man/*.Rd | Regenerated docs for renamed/new methods and link fixes. |
| vignettes/GSVA_scRNAseq.Rmd, GSVA_proteomics.Rmd, GSVA.bib | Documents three-step API, QC workflow updates, new reference. |
| inst/unitTests/* | Updates tests to new API and adds chunked-range checks. |
| DESCRIPTION, .covrignore | Roxygen config bump and coverage exclusions for deprecated/defunct files. |

Files not reviewed (9)
  • man/GSVA-pkg-deprecated.Rd: Language not supported
  • man/GsvaExprData-class.Rd: Language not supported
  • man/GsvaMethodParam-class.Rd: Language not supported
  • man/geneIdsToGeneSetCollection.Rd: Language not supported
  • man/geneSets.Rd: Language not supported
  • man/gsva.Rd: Language not supported
  • man/gsvaAnnotation.Rd: Language not supported
  • man/gsvaEnrichment.Rd: Language not supported
  • man/gsvaParam-class.Rd: Language not supported

Comment thread R/gsva.R Outdated
Comment thread R/gsva.R Outdated
Comment thread R/gsvaRanks_serialization.R Outdated
Comment thread vignettes/GSVA_scRNAseq.Rmd Outdated
Comment thread vignettes/GSVA_scRNAseq.Rmd
Comment thread R/gsva.R Outdated
rcastelo and others added 4 commits May 25, 2026 21:00
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@rcastelo rcastelo merged commit a7730d0 into devel May 25, 2026
2 checks passed
@rcastelo rcastelo deleted the 266-split-gsva-ranks-calculations branch May 25, 2026 20:48
Development

Successfully merging this pull request may close these issues.

Split GSVA ranks calculations

2 participants