Define multi-signature comparison API and heatmap support #175

@Cvicnaire

Summary

Implement first-class matrix-based signature comparison utilities in SigRepo.

The new API should support:

  • all-vs-all comparison within one list of signatures
  • all-vs-all comparison between two lists of signatures
  • multiple comparison metrics
  • heatmap visualization of the resulting similarity matrices

Background

There are already pairwise utilities in R/compareSignature.R:

  • compareSignatureFeatures()
  • compareSignatureScores()
  • signatureSetKsTest()
  • plotSignatureOverlap()

These are useful building blocks, but they do not implement the matrix-style API from the original task.

The original task also referenced MLscripts/R/signature_overlap.R as a possible starting point for the all-vs-all overlap workflow, but that file is not present in this repo.

Required API

Support two call shapes, analogous to cor():

compareSignatures(sig_list, ...)
  • compare all signatures in sig_list against each other
  • return an n x n matrix, where n = length(sig_list)

compareSignatures(sig_list1, sig_list2, ...)
  • compare every signature in sig_list1 with every signature in sig_list2
  • return an n1 x n2 matrix, where n1 = length(sig_list1) and n2 = length(sig_list2)
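
A minimal sketch of the two call shapes, working on plain character vectors rather than OmicSignature objects. The names compare_sets() and jaccard_pair() are hypothetical and only illustrate the cor()-style dispatch; they are not existing SigRepo functions.

```r
# Hypothetical stand-in for a pairwise metric: Jaccard on character vectors.
jaccard_pair <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# When sets2 is omitted, the first list is compared against itself,
# mirroring how cor(x) behaves when y is not supplied.
compare_sets <- function(sets1, sets2 = NULL, pair_fun = jaccard_pair) {
  if (is.null(sets2)) sets2 <- sets1
  out <- matrix(NA_real_, nrow = length(sets1), ncol = length(sets2),
                dimnames = list(names(sets1), names(sets2)))
  for (i in seq_along(sets1)) {
    for (j in seq_along(sets2)) {
      out[i, j] <- pair_fun(sets1[[i]], sets2[[j]])
    }
  }
  out
}

sets_a <- list(sigA = c("TP53", "MYC", "EGFR"), sigB = c("MYC", "EGFR", "KRAS"))
sets_b <- list(sigX = c("TP53"), sigY = c("KRAS", "BRAF"), sigZ = c("MYC"))

one_list <- compare_sets(sets_a)          # rows and columns both from sets_a
two_list <- compare_sets(sets_a, sets_b)  # rows from sets_a, columns from sets_b
```

Rows always come from the first argument and columns from the second, which keeps the orientation unambiguous once a directional metric is plugged in.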

Required Metrics

Implement:

  1. jaccard
  2. fisher
  3. ks

Jaccard

  • operate on significant feature membership from omic_signature$signature
  • return Jaccard similarity
  • must be symmetric
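
As an illustration, Jaccard on the signature table could look roughly like the sketch below. The signatures here are plain lists standing in for OmicSignature objects, jaccard_similarity() is a hypothetical name, and returning NA for two empty feature sets is an assumption rather than a stated requirement.

```r
# Mock signatures: plain lists standing in for OmicSignature objects.
sig_a <- list(signature = data.frame(
  feature_name = c("TP53", "MYC", "EGFR", "KRAS"),
  score = c(2.1, -1.4, 0.9, 1.7)
))
sig_b <- list(signature = data.frame(
  feature_name = c("MYC", "KRAS", "BRAF"),
  score = c(-0.8, 1.2, 2.5)
))

jaccard_similarity <- function(sig1, sig2, feature_col = "feature_name") {
  a <- unique(as.character(sig1$signature[[feature_col]]))
  b <- unique(as.character(sig2$signature[[feature_col]]))
  n_union <- length(union(a, b))
  # Assumption: two empty feature sets give NA rather than 0 or 1.
  if (n_union == 0) return(NA_real_)
  length(intersect(a, b)) / n_union
}

j_ab <- jaccard_similarity(sig_a, sig_b)
j_ba <- jaccard_similarity(sig_b, sig_a)
```

Because intersect() and union() do not depend on argument order, the two calls return the same value, which is what makes the one-list matrix symmetric.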

Fisher

  • operate on significant feature overlap from omic_signature$signature
  • return Fisher exact test p-value by default
  • must be symmetric

Optional extension:

  • add a transform such as -log10(p) later
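
A possible sketch of the Fisher metric using stats::fisher.test() on a 2 x 2 membership table. The issue does not say which background feature set to test against, so the universe argument is an assumption, as is the use of the default two-sided alternative; fisher_overlap_p() is a hypothetical name.

```r
fisher_overlap_p <- function(a, b, universe) {
  a <- intersect(unique(a), universe)
  b <- intersect(unique(b), universe)
  # Fixed factor levels guarantee a 2 x 2 table even when a cell is empty.
  in_a <- factor(universe %in% a, levels = c(TRUE, FALSE))
  in_b <- factor(universe %in% b, levels = c(TRUE, FALSE))
  fisher.test(table(in_a, in_b))$p.value
}

# Assumed background: 20 measured features.
universe <- paste0("g", 1:20)
set_a <- paste0("g", 1:6)
set_b <- paste0("g", 4:9)

p_ab <- fisher_overlap_p(set_a, set_b, universe)
p_ba <- fisher_overlap_p(set_b, set_a, universe)

# Optional extension: transform the p-value for plotting.
neg_log10_p <- -log10(p_ab)
```

Swapping the two sets only transposes the contingency table, so the two-sided p-value is unchanged up to floating-point error.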

KS

  • directional metric
  • similarity(sig1, sig2) uses the ranked scores from sig1$difexp and the feature set from sig2$signature
  • similarity(sig2, sig1) reverses those roles
  • must not be forced symmetric

Preferred diagonal behavior for KS:

  • NA
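
One plausible reading of the directional KS metric, sketched with stats::ks.test(): compare the sig1$difexp scores of features inside the sig2 set against the scores of features outside it, and store the D statistic. Whether the real implementation should use the statistic, the p-value, or the existing signatureSetKsTest() helper is left open here; make_sig() and ks_directional() are hypothetical, and the signatures are mock lists.

```r
# Mock signature: difexp holds all scores, signature holds the top features.
make_sig <- function(name, scores, n_top = 3) {
  difexp <- data.frame(feature_name = paste0("g", seq_along(scores)),
                       score = scores)
  top <- difexp[order(-difexp$score)[seq_len(n_top)], ]
  list(metadata = list(signature_name = name), difexp = difexp, signature = top)
}

sigs <- list(
  make_sig("sigA", 10:1),
  make_sig("sigB", c(1, 9, 2, 8, 3, 7, 4, 6, 5, 10))
)

# Scores come from sig1$difexp, the feature set from sig2$signature.
ks_directional <- function(sig1, sig2, feature_col = "feature_name",
                           score_col = "score") {
  in_set <- sig1$difexp[[feature_col]] %in% sig2$signature[[feature_col]]
  scores <- sig1$difexp[[score_col]]
  unname(ks.test(scores[in_set], scores[!in_set])$statistic)
}

labels <- vapply(sigs, function(s) s$metadata$signature_name, character(1))
ks_mat <- matrix(NA_real_, length(sigs), length(sigs),
                 dimnames = list(labels, labels))
for (i in seq_along(sigs)) {
  for (j in seq_along(sigs)) {
    # The diagonal is left as NA, the preferred behavior for KS.
    if (i != j) ks_mat[i, j] <- ks_directional(sigs[[i]], sigs[[j]])
  }
}
```

Entry [i, j] and entry [j, i] are computed from different score tables and different feature sets, so the matrix is filled cell by cell and never mirrored.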

Visualization Requirements

Add heatmap visualization of the similarity matrix or matrices:

  • one-list mode: rows and columns indexed by the same signature list
  • two-list mode: rows = sig_list1, columns = sig_list2
  • preserve directionality for KS
  • support legend, title override, color palette override, and clean NA handling
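
A rough base-graphics sketch of a heatmap helper that keeps rows and columns in matrix order and draws NA cells in a neutral color. plot_similarity_heatmap() and its arguments are hypothetical, the color legend is omitted to keep the example short, and a real implementation might prefer a dedicated heatmap package.

```r
plot_similarity_heatmap <- function(mat, title = "Signature similarity",
                                    palette = hcl.colors(50, "Viridis"),
                                    na_color = "grey85") {
  stopifnot(is.matrix(mat), is.numeric(mat))
  nr <- nrow(mat)
  nc <- ncol(mat)
  row_labels <- if (is.null(rownames(mat))) seq_len(nr) else rownames(mat)
  col_labels <- if (is.null(colnames(mat))) seq_len(nc) else colnames(mat)

  # image() puts the first dimension of z on the x axis, so flip the rows
  # and transpose to show row 1 at the top and columns left to right.
  z <- t(mat[nr:1, , drop = FALSE])

  plot(NA, type = "n", xlim = c(0.5, nc + 0.5), ylim = c(0.5, nr + 0.5),
       xaxs = "i", yaxs = "i", axes = FALSE, xlab = "", ylab = "",
       main = title)
  # Paint the whole grid in the NA color, then overlay the finite cells.
  rect(0.5, 0.5, nc + 0.5, nr + 0.5, col = na_color, border = NA)
  if (any(is.finite(z))) {
    image(seq_len(nc), seq_len(nr), z, col = palette, add = TRUE)
  }
  axis(1, at = seq_len(nc), labels = col_labels, las = 2, tick = FALSE)
  axis(2, at = seq_len(nr), labels = rev(row_labels), las = 1, tick = FALSE)
  box()
  invisible(mat)
}

# A KS-like directional matrix with an NA diagonal.
ks_like <- matrix(c(NA, 0.7, 0.2,
                    0.4, NA, 0.9,
                    0.1, 0.6, NA),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("sigA", "sigB", "sigC"),
                                  c("sigA", "sigB", "sigC")))

grDevices::pdf(file = NULL)  # null device so the sketch runs headless
plot_similarity_heatmap(ks_like, title = "KS (rows = scores, columns = sets)")
invisible(grDevices::dev.off())
```

The matrix is plotted as given, with no clustering or reordering, so a directional matrix keeps its row and column meaning.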

Input Expectations

First pass should support lists of OmicSignature objects directly.

Nice-to-have later:

  • signature IDs
  • signature names
  • lookup through conn_handler

Output Expectations

Preferred first pass:

  • return a numeric matrix
  • attach metadata via attributes if needed

Alternative acceptable shape:

  • return list(similarity = <matrix>, metadata = <list>)

The choice must be consistent across metrics.
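
For comparison, the two output shapes might look like this. The attribute and metadata field names (metric, mode) are illustrative assumptions, not part of the spec.

```r
labels <- c("sigA", "sigB")
sim <- matrix(c(1, 0.4, 0.4, 1), nrow = 2, dimnames = list(labels, labels))

# Preferred shape: a numeric matrix with metadata attached as attributes.
sim_attr <- sim
attr(sim_attr, "metric") <- "jaccard"
attr(sim_attr, "mode") <- "one_list"

# Alternative shape: a list holding the matrix and a metadata list.
sim_list <- list(
  similarity = sim,
  metadata = list(metric = "jaccard", mode = "one_list")
)

# The attribute version still behaves like an ordinary matrix.
is.matrix(sim_attr)
attr(sim_attr, "metric")
```

One trade-off to keep in mind: custom attributes are dropped by subsetting such as sim_attr[1, , drop = FALSE], whereas the list shape keeps metadata separate from the matrix.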

Validation Requirements

Fail clearly when:

  • input is not a list
  • list elements are not OmicSignature objects
  • a required signature table is missing or empty
  • KS is requested but difexp is missing or empty
  • required columns such as feature_name or score are missing
  • input list lengths are invalid for the selected mode
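
A sketch of how these checks could be organized. It uses a mock S3 object with class "OmicSignature" so the example is self-contained; validate_sig_list(), check_table(), and mock_sig() are hypothetical names, and the error wording is only a suggestion.

```r
check_table <- function(tbl, table_name, required_cols, label) {
  if (!is.data.frame(tbl) || nrow(tbl) == 0) {
    stop(sprintf("%s: '%s' table is missing or empty", label, table_name),
         call. = FALSE)
  }
  missing_cols <- setdiff(required_cols, colnames(tbl))
  if (length(missing_cols) > 0) {
    stop(sprintf("%s: '%s' is missing column(s): %s", label, table_name,
                 paste(missing_cols, collapse = ", ")), call. = FALSE)
  }
  invisible(TRUE)
}

validate_sig_list <- function(sig_list, metric = "jaccard",
                              feature_col = "feature_name",
                              score_col = "score", arg = "sig_list") {
  if (!is.list(sig_list) || length(sig_list) == 0) {
    stop(sprintf("'%s' must be a non-empty list", arg), call. = FALSE)
  }
  for (i in seq_along(sig_list)) {
    sig <- sig_list[[i]]
    label <- sprintf("%s[[%d]]", arg, i)
    if (!inherits(sig, "OmicSignature")) {
      stop(sprintf("%s is not an OmicSignature object", label), call. = FALSE)
    }
    check_table(sig$signature, "signature", feature_col, label)
    # difexp and the score column are only required for KS.
    if (identical(metric, "ks")) {
      check_table(sig$difexp, "difexp", c(feature_col, score_col), label)
    }
  }
  invisible(TRUE)
}

# Mock object standing in for a real OmicSignature instance.
mock_sig <- function(signature = NULL, difexp = NULL) {
  structure(list(signature = signature, difexp = difexp),
            class = "OmicSignature")
}

good <- mock_sig(signature = data.frame(feature_name = c("TP53", "MYC"),
                                        score = c(1.5, -2)))
validate_sig_list(list(good), metric = "jaccard")

# KS needs difexp, which this mock lacks, so the call fails with a message.
ks_msg <- tryCatch(validate_sig_list(list(good), metric = "ks"),
                   error = function(e) conditionMessage(e))
```

Running validation once per list, before any pairwise work starts, keeps error messages tied to a specific list position instead of surfacing from inside the comparison loop.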

Testing Requirements

At minimum, add tests for:

  1. one-list Jaccard returns a symmetric square matrix
  2. two-list Jaccard returns a rectangular matrix with correct dimensions
  3. Fisher output is symmetric
  4. KS output is directional and not symmetric
  5. KS fails cleanly when difexp is missing
  6. signature labels are applied correctly
  7. diagonal behavior is correct
  8. heatmap helper accepts produced matrices without error
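
The symmetry, label, and diagonal checks can be prototyped with base R assertions before being ported to the package's test framework. In this sketch pairwise_matrix() is a hypothetical helper and containment() is a toy directional metric standing in for KS.

```r
pairwise_matrix <- function(items, fun, diag_value = NULL) {
  n <- length(items)
  out <- matrix(NA_real_, n, n, dimnames = list(names(items), names(items)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      out[i, j] <- fun(items[[i]], items[[j]])
    }
  }
  if (!is.null(diag_value)) diag(out) <- diag_value
  out
}

sets <- list(sigA = c("a", "b", "c"), sigB = c("b", "c", "d", "e"), sigC = "e")

jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))
# Toy directional metric: fraction of a that is also found in b.
containment <- function(a, b) mean(a %in% b)

jac <- pairwise_matrix(sets, jaccard)
dir_mat <- pairwise_matrix(sets, containment, diag_value = NA_real_)

stopifnot(
  isSymmetric(jac),                         # symmetric metric
  all(diag(jac) == 1),                      # diagonal behavior
  identical(rownames(jac), names(sets)),    # labels applied
  identical(colnames(jac), names(sets)),
  all(is.na(diag(dir_mat))),                # NA diagonal for directional metric
  !isTRUE(all.equal(dir_mat, t(dir_mat)))   # directional, not symmetric
)
```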

Suggested Implementation Notes

  • reuse existing helpers in R/compareSignature.R where practical
  • pre-extract feature sets and ranked score tables once per signature to avoid repeated work
  • do not repeatedly fetch signatures inside the innermost comparison loop if objects are already provided
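
A sketch of the pre-extraction idea for a symmetric metric: pull labels and feature sets out of each signature once, then loop over the upper triangle and mirror the result. The signatures are mock lists, and the variable names are illustrative.

```r
sigs <- list(
  list(metadata = list(signature_name = "sigA"),
       signature = data.frame(feature_name = c("TP53", "MYC", "EGFR"))),
  list(metadata = list(signature_name = "sigB"),
       signature = data.frame(feature_name = c("MYC", "EGFR", "KRAS"))),
  list(metadata = list(signature_name = "sigC"),
       signature = data.frame(feature_name = "BRAF"))
)

# Extract labels and feature sets once, outside the comparison loop.
labels <- vapply(sigs, function(s) s$metadata$signature_name, character(1))
feature_sets <- lapply(sigs, function(s) {
  unique(as.character(s$signature$feature_name))
})

n <- length(sigs)
jac <- matrix(NA_real_, n, n, dimnames = list(labels, labels))
for (i in seq_len(n)) {
  for (j in i:n) {
    a <- feature_sets[[i]]
    b <- feature_sets[[j]]
    value <- length(intersect(a, b)) / length(union(a, b))
    # Symmetric metric: fill both triangles from a single computation.
    jac[i, j] <- value
    jac[j, i] <- value
  }
}
```

Mirroring only applies to Jaccard and Fisher. For KS every ordered pair has to be computed separately, although the ranked score tables can still be extracted once per signature.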

Additional Requirements

A fuller implementation should also make the following explicit:

  • Jaccard and Fisher must return symmetric matrices in one-list mode
  • KS should return a directional square matrix in one-list mode; the matrix does not need to equal its transpose
  • row and column labels should be deterministic, preferably using metadata$signature_name
  • default feature column should be feature_name
  • default score column for KS should be score
  • heatmaps should preserve row/column orientation clearly for directional metrics

Acceptance Criteria

The task is complete when:

  • both one-list and two-list modes work
  • Jaccard, Fisher, and KS metrics are implemented
  • symmetry behavior matches the metric definitions
  • heatmap visualization is available
  • tests cover directional KS behavior and symmetric Jaccard/Fisher behavior
  • the documentation is clear enough for a new contributor to implement and maintain the feature
