Summary
Implement first-class matrix-based signature comparison utilities in SigRepo.
The new API should support:
- all-vs-all comparison within one list of signatures
- all-vs-all comparison between two lists of signatures
- multiple comparison metrics
- heatmap visualization of the resulting similarity matrices
Background
There are already pairwise utilities in R/compareSignature.R:
compareSignatureFeatures()
compareSignatureScores()
signatureSetKsTest()
plotSignatureOverlap()
These are useful building blocks, but they do not implement the matrix-style API from the original task.
The original task also referenced MLscripts/R/signature_overlap.R as a possible starting point for the all-vs-all overlap workflow, but that file is not present in this repo.
Required API
Support two call shapes, analogous to cor():
compareSignatures(sig_list, ...)
- compare all signatures in sig_list against each other
- return an n x n matrix
compareSignatures(sig_list1, sig_list2, ...)
- compare every signature in sig_list1 with every signature in sig_list2
- return an n1 x n2 matrix
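The two call shapes above could be sketched roughly as follows. This is only an illustration: the metric_fun argument is a hypothetical stand-in for however the metric ends up being selected, and plain character vectors stand in for OmicSignature objects.

```r
# Hypothetical sketch of the two call shapes; metric_fun is an assumed argument.
compareSignatures <- function(sig_list1, sig_list2 = NULL, metric_fun) {
  # One-list mode: compare the list against itself, as cor(x) does.
  if (is.null(sig_list2)) sig_list2 <- sig_list1

  sim <- matrix(
    NA_real_,
    nrow = length(sig_list1), ncol = length(sig_list2),
    dimnames = list(names(sig_list1), names(sig_list2))
  )
  # Rows index sig_list1, columns index sig_list2.
  for (i in seq_along(sig_list1)) {
    for (j in seq_along(sig_list2)) {
      sim[i, j] <- metric_fun(sig_list1[[i]], sig_list2[[j]])
    }
  }
  sim
}

# Toy input: character vectors stand in for OmicSignature objects.
toy <- list(a = c("G1", "G2", "G3"), b = c("G2", "G3", "G4"), c = "G5")
overlap <- function(x, y) length(intersect(x, y)) / length(union(x, y))

square <- compareSignatures(toy, metric_fun = overlap)               # 3 x 3
rect   <- compareSignatures(toy[1:2], toy[3], metric_fun = overlap)  # 2 x 1
```

Defaulting sig_list2 to NULL keeps a single entry point for both modes, which mirrors how cor(x) and cor(x, y) share one function.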
Required Metrics
Implement:
jaccard
fisher
ks
Jaccard
- operate on significant feature membership from omic_signature$signature
- return Jaccard similarity
- must be symmetric
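A minimal sketch of the Jaccard metric, assuming the signature table is a data frame with a feature_name column. The mock_sig() helper and the function name jaccard_similarity() are hypothetical; a real implementation would take OmicSignature objects.

```r
# Mock of the relevant part of an OmicSignature object (assumption: the real
# object exposes a data frame in $signature with a feature_name column).
mock_sig <- function(features) {
  list(signature = data.frame(feature_name = features))
}

jaccard_similarity <- function(sig1, sig2, feature_col = "feature_name") {
  a <- unique(as.character(sig1$signature[[feature_col]]))
  b <- unique(as.character(sig2$signature[[feature_col]]))
  n_union <- length(union(a, b))
  # Two empty sets have no defined Jaccard index.
  if (n_union == 0) return(NA_real_)
  length(intersect(a, b)) / n_union
}

s1 <- mock_sig(c("TP53", "MYC", "EGFR"))
s2 <- mock_sig(c("MYC", "EGFR", "KRAS", "BRCA1"))
j12 <- jaccard_similarity(s1, s2)  # 2 shared features / 5 in the union = 0.4
j21 <- jaccard_similarity(s2, s1)  # same value: intersect and union commute
```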
Fisher
- operate on significant feature overlap from omic_signature$signature
- return Fisher exact test p-value by default
- must be symmetric
Optional extension:
- add a transform such as -log10(p) later
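One way the Fisher metric might look, using stats::fisher.test() on a 2 x 2 membership table. The task does not say which background feature set to test against, so the universe argument here is purely an assumption, as is the function name.

```r
# Hypothetical helper: two-sided Fisher exact test on the overlap of two
# feature sets. `universe` (the background feature set) is an assumed
# argument; the task does not specify how the background is chosen.
fisher_overlap_p <- function(features1, features2, universe) {
  universe <- unique(universe)
  # Fixed factor levels guarantee a 2 x 2 table even when a cell is empty.
  in_1 <- factor(universe %in% features1, levels = c(TRUE, FALSE))
  in_2 <- factor(universe %in% features2, levels = c(TRUE, FALSE))
  stats::fisher.test(table(in_1, in_2))$p.value
}

universe <- paste0("G", 1:20)
f1 <- paste0("G", 1:6)
f2 <- paste0("G", 4:9)

p12 <- fisher_overlap_p(f1, f2, universe)
p21 <- fisher_overlap_p(f2, f1, universe)  # transposed table, same p-value
neg_log_p <- -log10(p12)                   # the optional transform
```

Swapping the two sets only transposes the contingency table, which is why the p-value is symmetric.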
KS
- directional metric
- similarity(sig1, sig2) uses ranked scores from sig1$difexp and feature set from sig2$signature
- similarity(sig2, sig1) reverses those roles
- must not be forced symmetric
Preferred diagonal behavior for KS: NA
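The directional behavior can be illustrated with a small sketch built on stats::ks.test(). It is not the existing signatureSetKsTest() helper, whose signature is not shown here. Returning the KS statistic D rather than a p-value is an assumption, and mock_sig() and ks_similarity() are hypothetical names.

```r
# Mock signatures: plain lists with the two tables the KS metric needs.
mock_sig <- function(features, scores, significant) {
  list(
    signature = data.frame(feature_name = significant),
    difexp = data.frame(feature_name = features, score = scores)
  )
}

# Scores come from sig1$difexp; the feature set comes from sig2$signature.
# Assumption: the KS statistic D is returned (a p-value would also be possible).
ks_similarity <- function(sig1, sig2, feature_col = "feature_name",
                          score_col = "score") {
  scores <- sig1$difexp[[score_col]]
  in_set <- sig1$difexp[[feature_col]] %in% sig2$signature[[feature_col]]
  if (!any(in_set) || all(in_set)) return(NA_real_)
  res <- suppressWarnings(stats::ks.test(scores[in_set], scores[!in_set]))
  unname(res$statistic)
}

genes <- paste0("G", 1:10)
sigs <- list(
  A = mock_sig(genes, 10:1, c("G1", "G2", "G3")),
  B = mock_sig(genes, c(1, 9, 2, 8, 3, 7, 4, 6, 5, 10), c("G8", "G9", "G10"))
)

n <- length(sigs)
ks_mat <- matrix(NA_real_, n, n, dimnames = list(names(sigs), names(sigs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    # Leave the diagonal as NA and never copy [i, j] into [j, i].
    if (i != j) ks_mat[i, j] <- ks_similarity(sigs[[i]], sigs[[j]])
  }
}
```

In this toy data the two off-diagonal entries differ, because each direction ranks a different score vector against a different feature set.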
Visualization Requirements
Add heatmap visualization of the similarity matrix or matrices:
- one-list mode: rows and columns indexed by the same signature list
- two-list mode: rows = sig_list1, columns = sig_list2
- preserve directionality for KS
- support legend, title override, color palette override, and clean NA handling
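As a rough illustration of these requirements, here is a base-graphics sketch. The function name, its arguments, and the choice of base graphics over whatever plotting library the package already uses are all assumptions. Rows stay on the y axis and columns on the x axis, so a directional KS matrix is never transposed or mirrored.

```r
# Hypothetical heatmap helper using only base graphics.
plot_similarity_heatmap <- function(sim, title = "Signature similarity",
                                    palette = grDevices::hcl.colors(25, "YlOrRd", rev = TRUE),
                                    na_color = "grey85", show_legend = TRUE) {
  stopifnot(is.matrix(sim), is.numeric(sim))
  nr <- nrow(sim)
  nc <- ncol(sim)
  row_labels <- if (is.null(rownames(sim))) seq_len(nr) else rownames(sim)
  col_labels <- if (is.null(colnames(sim))) seq_len(nc) else colnames(sim)

  old_par <- graphics::par(mar = c(6, 6, 3, 7))
  on.exit(graphics::par(old_par), add = TRUE)

  graphics::plot.new()
  graphics::plot.window(xlim = c(0.5, nc + 0.5), ylim = c(0.5, nr + 0.5),
                        xaxs = "i", yaxs = "i")
  # Paint the background first: image() leaves NA cells unpainted, so they
  # show through in na_color.
  graphics::rect(0.5, 0.5, nc + 0.5, nr + 0.5, col = na_color, border = NA)
  if (any(is.finite(sim))) {
    # image() puts rows of z on the x axis, so transpose; reversing the rows
    # first keeps row 1 of `sim` at the top of the plot.
    z <- t(sim[nr:1, , drop = FALSE])
    graphics::image(x = seq(0.5, nc + 0.5), y = seq(0.5, nr + 0.5), z = z,
                    col = palette, add = TRUE)
  }
  graphics::axis(1, at = seq_len(nc), labels = col_labels, las = 2, tick = FALSE)
  graphics::axis(2, at = seq_len(nr), labels = rev(row_labels), las = 1, tick = FALSE)
  graphics::title(main = title)
  if (show_legend && any(is.finite(sim))) {
    rng <- range(sim, finite = TRUE)
    graphics::legend("right", inset = c(-0.35, 0), xpd = TRUE, bty = "n",
                     fill = c(palette[length(palette)], palette[1], na_color),
                     legend = c(format(rng[2], digits = 2),
                                format(rng[1], digits = 2), "NA"))
  }
  invisible(sim)
}

# A directional matrix with an NA diagonal, as a KS comparison would produce.
ks_like <- matrix(c(NA, 0.67, 1, NA), nrow = 2,
                  dimnames = list(c("A", "B"), c("A", "B")))
plot_similarity_heatmap(ks_like, title = "KS (rows = scores, columns = sets)")
```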
Input Expectations
First pass should support lists of OmicSignature objects directly.
Nice-to-have later:
- signature IDs
- signature names
- lookup through conn_handler
Output Expectations
Preferred first pass:
- return a numeric matrix
- attach metadata via attributes if needed
Alternative acceptable shape:
- return list(similarity = <matrix>, metadata = <list>)
The choice must be consistent across metrics.
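The preferred shape, a numeric matrix carrying metadata in attributes, might be sketched like this. The helper name and the attribute names (metric, mode, symmetric) are illustrative assumptions, not an agreed schema.

```r
# Hypothetical helper: attach comparison metadata as attributes so the result
# is still an ordinary numeric matrix.
annotate_similarity <- function(sim, metric, mode = c("one-list", "two-list")) {
  mode <- match.arg(mode)
  stopifnot(is.matrix(sim), is.numeric(sim))
  attr(sim, "metric") <- metric
  attr(sim, "mode") <- mode
  attr(sim, "symmetric") <- metric %in% c("jaccard", "fisher") && mode == "one-list"
  sim
}

sim <- matrix(c(1, 0.4, 0.4, 1), nrow = 2,
              dimnames = list(c("sigA", "sigB"), c("sigA", "sigB")))
out <- annotate_similarity(sim, metric = "jaccard")

attr(out, "metric")   # "jaccard"
out["sigA", "sigB"]   # indexing works as for any matrix
```

One trade-off to keep in mind: custom attributes are dropped by most subsetting operations (for example out[1, , drop = FALSE]), which is an argument for the list shape if the metadata must survive downstream manipulation.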
Validation Requirements
Fail clearly when:
- input is not a list
- list elements are not OmicSignature objects
- a required signature table is missing or empty
- KS is requested but difexp is missing or empty
- required columns such as feature_name or score are missing
- input list lengths are invalid for the selected mode
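The checks listed above could be gathered into one validation helper along these lines. The function names, the error wording, and the use of inherits(x, "OmicSignature") as the class test are assumptions. The mock object below only imitates the fields the checks touch.

```r
# Hypothetical helper: check one table for presence, rows, and columns.
check_table <- function(tbl, table_name, required_cols, where) {
  if (!is.data.frame(tbl) || nrow(tbl) == 0) {
    stop(sprintf("%s: '%s' table is missing or empty", where, table_name),
         call. = FALSE)
  }
  missing_cols <- setdiff(required_cols, colnames(tbl))
  if (length(missing_cols) > 0) {
    stop(sprintf("%s: '%s' lacks column(s): %s", where, table_name,
                 paste(missing_cols, collapse = ", ")), call. = FALSE)
  }
}

validate_signature_list <- function(sig_list, metric = c("jaccard", "fisher", "ks"),
                                    feature_col = "feature_name",
                                    score_col = "score", arg_name = "sig_list") {
  metric <- match.arg(metric)
  if (!is.list(sig_list) || is.data.frame(sig_list)) {
    stop(sprintf("'%s' must be a list of OmicSignature objects", arg_name),
         call. = FALSE)
  }
  if (length(sig_list) == 0) {
    stop(sprintf("'%s' must contain at least one signature", arg_name),
         call. = FALSE)
  }
  for (i in seq_along(sig_list)) {
    where <- sprintf("%s[[%d]]", arg_name, i)
    sig <- sig_list[[i]]
    if (!inherits(sig, "OmicSignature")) {
      stop(sprintf("%s is not an OmicSignature object", where), call. = FALSE)
    }
    check_table(sig$signature, "signature", feature_col, where)
    # difexp is only required for the KS metric.
    if (metric == "ks") {
      check_table(sig$difexp, "difexp", c(feature_col, score_col), where)
    }
  }
  invisible(TRUE)
}

# Mock object with a signature table but no difexp table.
mock <- structure(
  list(signature = data.frame(feature_name = c("G1", "G2")), difexp = NULL),
  class = "OmicSignature"
)
ok <- validate_signature_list(list(mock), metric = "jaccard")
msg <- tryCatch(validate_signature_list(list(mock), metric = "ks"),
                error = conditionMessage)
```

Including the list position in each message makes it easier to find the offending signature in a long input list.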
Testing Requirements
At minimum, add tests for:
- one-list Jaccard returns a symmetric square matrix
- two-list Jaccard returns a rectangular matrix with correct dimensions
- Fisher output is symmetric
- KS output is directional and not symmetric
- KS fails cleanly when difexp is missing
- signature labels are applied correctly
- diagonal behavior is correct
- heatmap helper accepts produced matrices without error
Suggested Implementation Notes
- reuse existing helpers in R/compareSignature.R where practical
- pre-extract feature sets and ranked score tables once per signature to avoid repeated work
- do not repeatedly fetch signatures inside the innermost comparison loop if objects are already provided
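The pre-extraction idea can be sketched as follows: feature sets are pulled out of each signature once, and the comparison loop only touches the cached vectors. For a symmetric metric the loop can also fill just the upper triangle and mirror it. Helper names are hypothetical and plain lists stand in for OmicSignature objects.

```r
# Hypothetical helper: extract each signature's feature set exactly once.
extract_feature_sets <- function(sig_list, feature_col = "feature_name") {
  lapply(sig_list, function(sig) {
    unique(as.character(sig$signature[[feature_col]]))
  })
}

mock_sig <- function(features) {
  list(signature = data.frame(feature_name = features))
}
sigs <- list(
  a = mock_sig(c("G1", "G2", "G3")),
  b = mock_sig(c("G2", "G3", "G4")),
  c = mock_sig(c("G3", "G4", "G5", "G6"))
)

sets <- extract_feature_sets(sigs)  # done once, outside the loop
n <- length(sets)

# A non-empty set has Jaccard similarity 1 with itself.
sim <- diag(1, n)
dimnames(sim) <- list(names(sets), names(sets))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    value <- length(intersect(sets[[i]], sets[[j]])) /
      length(union(sets[[i]], sets[[j]]))
    sim[i, j] <- value
    sim[j, i] <- value  # mirror: valid only for symmetric metrics
  }
}
```

The mirroring shortcut applies to Jaccard and Fisher only. KS needs both directions computed separately.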
Additional Requirements
A fuller implementation should also make the following explicit:
- Jaccard and Fisher must return symmetric matrices in one-list mode
- KS should return a directional square matrix in one-list mode and does not need to equal its transpose
- row and column labels should be deterministic, preferably using metadata$signature_name
- default feature column should be feature_name
- default score column for KS should be score
- heatmaps should preserve row/column orientation clearly for directional metrics
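One possible way to derive deterministic labels is sketched below. The fallback order (metadata$signature_name, then the list name, then a positional label) and the use of make.unique() for duplicates are assumptions, not decided behavior.

```r
# Hypothetical helper: one label per signature, stable for a given input order.
signature_labels <- function(sig_list) {
  labels <- vapply(seq_along(sig_list), function(i) {
    nm <- sig_list[[i]]$metadata$signature_name
    if (is.null(nm) || length(nm) != 1 || is.na(nm) || !nzchar(nm)) {
      # Fall back to the list name, then to the position in the list.
      list_nm <- names(sig_list)[i]
      nm <- if (!is.null(list_nm) && !is.na(list_nm) && nzchar(list_nm)) {
        list_nm
      } else {
        paste0("signature_", i)
      }
    }
    as.character(nm)
  }, character(1))
  # Duplicate names get a numeric suffix so dimnames stay unambiguous.
  make.unique(labels)
}

sigs <- list(
  list(metadata = list(signature_name = "up_sig")),
  B = list(metadata = list()),
  list(),
  list(metadata = list(signature_name = "up_sig"))
)
labels <- signature_labels(sigs)
# "up_sig" "B" "signature_3" "up_sig.1"
```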
Acceptance Criteria
The task is complete when:
- both one-list and two-list modes work
- Jaccard, Fisher, and KS metrics are implemented
- symmetry behavior matches the metric definitions
- heatmap visualization is available
- tests cover directional KS behavior and symmetric Jaccard/Fisher behavior
- the documentation is clear enough for a new contributor to implement and maintain the feature