Assigning chromatin status to predefined genomic regions from epigenomic profiling data
ChromCall is an R package for region-based chromatin enrichment analysis of epigenomic profiling data, including ChIP-seq, CUT&RUN, CUT&Tag, and ATAC-seq. It provides a transparent, statistically principled framework to quantify enrichment at predefined genomic regions (e.g. promoters or enhancers), enabling region-matched comparisons across samples and experiments without relying on data-dependent peak boundaries.
- Region-centric analysis: quantifies chromatin enrichment directly within predefined genomic windows, enabling consistent, region-matched comparisons across samples and experiments.
- Transparent statistical framework: employs a Negative Binomial–based background model incorporating
  - experiment-specific genome-wide background estimation
  - region-specific, control-derived modulation factors (when available)
- Control-aware enrichment testing (optional): supports integration of matched control experiments to account for local background variation. In the absence of a control, ChromCall relies on genome-wide background estimation.
- Multiple complementary metrics: for each region and experiment, ChromCall reports
  - FDR-adjusted p-values
  - enrichment score (log₂ observed / expected)
  - z-score
  - significance-based region classification
- Multi-experiment and multi-sample support: supports joint analysis of multiple chromatin marks and pairwise comparisons between samples within a unified framework.
- Optional expression integration: gene-level expression values mapped to regions using TSS-based annotation can be incorporated to enable integrated chromatin–transcription analyses.

ChromCall models read counts with a Negative Binomial (NB) distribution, which allows for overdispersion beyond the Poisson assumption. Such overdispersion arises from technical variability in sequencing depth, library preparation, and local chromatin accessibility, so the NB model is a more robust fit for sequencing-based count data.
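
To make the overdispersion point concrete, the sketch below compares Poisson and NB behaviour at the same mean using base R only. The values of `mu` and `size` are illustrative assumptions, not ChromCall defaults.

```r
# Illustrative values (assumptions, not ChromCall defaults)
mu   <- 10  # expected read count in a region
size <- 2   # NB size parameter; smaller values mean more overdispersion

# Theoretical variances: Poisson has var = mu, NB has var = mu + mu^2 / size
var_pois <- mu
var_nb   <- mu + mu^2 / size

# Upper-tail probability of observing 25 or more reads under each model
p_pois <- ppois(24, lambda = mu, lower.tail = FALSE)
p_nb   <- pnbinom(24, size = size, mu = mu, lower.tail = FALSE)

cat("Poisson variance:", var_pois, " NB variance:", var_nb, "\n")
cat("P(X >= 25)  Poisson:", signif(p_pois, 3), " NB:", signif(p_nb, 3), "\n")
```

The NB tail is far heavier than the Poisson tail, so a Poisson test applied to overdispersed background counts would call regions significant too readily.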
For each experiment, a genome-wide background rate (lambda_g) is estimated from genome-wide tile counts. Zero-count tiles are retained by default to avoid upward bias in sparse datasets.
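
The effect of retaining zero-count tiles can be illustrated with simulated data. This is a minimal sketch: the method-of-moments estimator and the helper name `estimate_background` are assumptions made for illustration and are not taken from ChromCall; only the output names `bg_mean` and `bg_size` follow the package's terminology.

```r
set.seed(1)

# Simulated tile counts for a sparse experiment (assumed parameters)
tile_counts <- rnbinom(10000, size = 0.5, mu = 0.8)

# Hypothetical helper: method-of-moments NB estimates.
# For an NB distribution, var = mean + mean^2 / size, so size = mean^2 / (var - mean).
estimate_background <- function(x) {
  m <- mean(x)
  v <- var(x)
  size <- if (v > m) m^2 / (v - m) else Inf  # Inf corresponds to the Poisson limit
  list(bg_mean = m, bg_size = size)
}

with_zeros    <- estimate_background(tile_counts)
without_zeros <- estimate_background(tile_counts[tile_counts > 0])

cat("bg_mean with zero tiles:   ", round(with_zeros$bg_mean, 3), "\n")
cat("bg_mean without zero tiles:", round(without_zeros$bg_mean, 3), "\n")
```

Dropping the empty tiles inflates the estimated background mean, which is the upward bias the default setting avoids.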
To account for region-specific variability, ChromCall optionally derives a modulation factor from a matched control experiment:
The expected signal for region i in experiment j is then defined as:
In the absence of a control dataset, the expected signal is based on the genome-wide background estimate alone.
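
One way to picture control-based modulation is sketched below. Both the multiplicative form and the ratio-based modulation factor are assumptions chosen for illustration; they are not confirmed to match ChromCall's formula, and `expected_signal` is a hypothetical helper.

```r
# Hypothetical helper (not part of ChromCall).
# ASSUMPTION: expected signal = genome-wide rate x per-region modulation factor.
expected_signal <- function(lambda_g, n_regions, control_counts = NULL) {
  if (is.null(control_counts)) {
    # No control: every region gets the genome-wide background rate
    return(rep(lambda_g, n_regions))
  }
  stopifnot(length(control_counts) == n_regions)
  # ASSUMPTION: modulation = control count relative to the mean control count.
  # A real implementation would likely bound this factor to avoid zeros.
  modulation <- control_counts / mean(control_counts)
  lambda_g * modulation
}

expected_signal(2, 3)                 # without a control
expected_signal(2, 3, c(5, 10, 15))   # with a control
```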
ChromCall evaluates region-level enrichment using a one-sided Negative Binomial test:
where
When dispersion is negligible (i.e.
Multiple testing correction is applied across all regions using the Benjamini–Hochberg false discovery rate (FDR) procedure.
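
The testing and correction steps can be sketched with base R's `pnbinom()` and `p.adjust()`. The counts, expected values, and size parameter below are made-up inputs, and the upper-tail form of the test is an assumption about what "one-sided" means for enrichment.

```r
# Illustrative inputs (assumed values)
observed <- c(3, 12, 40, 7, 25)  # read counts per region
lambda_t <- c(5, 5, 8, 6, 10)    # expected signal per region
bg_size  <- 3                    # NB size parameter for the experiment

# One-sided (upper-tail) test: P(X >= observed) under NB(mu = lambda_t, size = bg_size)
p_value <- pnbinom(observed - 1, size = bg_size, mu = lambda_t, lower.tail = FALSE)

# Benjamini-Hochberg adjustment across all regions
p_adj <- p.adjust(p_value, method = "BH")

# With a very large size parameter the NB distribution converges to the Poisson
p_nb_large <- pnbinom(observed - 1, size = 1e6, mu = lambda_t, lower.tail = FALSE)
p_pois     <- ppois(observed - 1, lambda = lambda_t, lower.tail = FALSE)

print(data.frame(observed, lambda_t, p_value, p_adj))
```

Subtracting 1 from the observed count makes the tail inclusive, since `pnbinom(q, lower.tail = FALSE)` returns P(X > q).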
In addition to significance testing, ChromCall reports complementary effect-size metrics:
- Enrichment score: log₂(observed / expected)
- z-score: an NB-based z-score (reported as z_nb)
Together, these metrics provide complementary measures of enrichment strength, effect size, and statistical confidence.
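
A minimal sketch of the two effect-size metrics follows. The pseudocount and the exact z-score formula (deviation from the expected count scaled by the NB standard deviation) are assumptions for illustration, not ChromCall's confirmed definitions.

```r
# Illustrative inputs (assumed values)
observed <- c(3, 12, 40)
lambda_t <- c(5, 5, 8)
bg_size  <- 3

# ASSUMPTION: a pseudocount of 1 keeps the log ratio finite for zero counts
pseudocount <- 1
score <- log2((observed + pseudocount) / (lambda_t + pseudocount))

# ASSUMPTION: z-score uses the NB standard deviation sqrt(mu + mu^2 / size)
nb_sd <- sqrt(lambda_t + lambda_t^2 / bg_size)
z_nb  <- (observed - lambda_t) / nb_sd

print(data.frame(observed, lambda_t, score, z_nb))
```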
ChromCall is implemented in R and builds upon the Bioconductor ecosystem, ensuring interoperability with standard genomic data structures and downstream analysis workflows:
- GRanges for representing genomic intervals
- SummarizedExperiment for storing structured assay outputs and metadata
- GenomicAlignments for importing aligned sequencing reads from BAM files
- GenomeInfoDb and Seqinfo for genome annotation and consistency checks

Each processed sample is returned as a SummarizedExperiment object containing raw region-level read counts together with experiment-level background parameters (bg_mean, bg_size). After statistical testing, additional assays including lambda_t, p_value, p_adj, score, and z_nb are appended.
Pairwise sample comparisons generate region-level Δ enrichment and Δ z-score metrics, enabling direct comparative analysis of chromatin states across biological conditions.
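
The idea behind the Δ metrics can be sketched on plain vectors. The subtraction direction (sample A minus sample B), the toy values, and the cutoff are assumptions for illustration; the sketch does not reproduce how `compare_samples()` interprets its `threshold` argument.

```r
# Toy per-region metrics for two samples (assumed values)
regions <- c("geneA", "geneB", "geneC")
score_a <- c(2.1, 0.3, -0.5)
z_a     <- c(5.0, 0.8, -1.1)
score_b <- c(0.4, 0.2, 1.9)
z_b     <- c(1.2, 0.5, 4.3)

comparison <- data.frame(
  region          = regions,
  DeltaEnrichment = score_a - score_b,  # ASSUMPTION: A minus B
  DeltaZscore     = z_a - z_b
)

# Flag regions whose enrichment shift exceeds an illustrative cutoff
cutoff <- 1
comparison$changed <- abs(comparison$DeltaEnrichment) > cutoff

print(comparison)
```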
ChromCall is available as a development version on GitHub and can be installed using remotes:

```r
# install.packages("remotes")
remotes::install_github("GliomaGenomics/ChromCall")
```

```r
library(ChromCall)

# Build a sample from BAM files, a region set, and optional annotation
sample <- build_chromcall_sample(
  sample_name = "sampleA",
  experiments = list(
    H3K27me3 = "h3k27me3.bam",
    H3K4me3 = "h3k4me3.bam",
    Control = "control.bam"
  ),
  control_name = "Control",
  genome_file = "genome.txt",
  region_file = "promoters.bed",
  window_size = 2000,
  blacklist_file = "blacklist.bed",
  expression_file = "expression_tss.bed"
)
```

```r
# Run the region-level enrichment test
result <- test_region_counts(sample)
```

```r
# resultA and resultB are test_region_counts() outputs for two samples
comparison <- compare_samples(resultA, resultB, threshold = 0.25)
```

```r
# Export per-experiment and comparison results
write_experiment_results(result, "H3K4me3", "results.tsv")
write_comparison_results(comparison, "comparison.tsv")
```

| Metric | Description |
|---|---|
| counts | Raw read count per region |
| lambda_g | Genome-wide background rate |
| lambda_t | Locally adjusted expected signal |
| p_value, p_adj | NB test p-values and FDR-adjusted values |
| score | log₂(Observed / Expected) enrichment |
| z_nb | NB-based z-score |
| DeltaEnrichment, DeltaZscore | Pairwise comparison metrics |

For questions, issues, or feature requests, please open a GitHub issue.