This is a from-scratch Julia implementation of the DESeq2 algorithm for differential expression analysis of RNA-seq datasets. More generally, DESeq2 provides robust dispersion estimates for Negative Binomial (NB) GLMs when the dispersion depends on the mean observed count. This allows NB GLMs to be fit to noisy datasets where few replicates are available.
This repository is a work in progress.
- Significant speedup on large datasets using Julia's compilation to native code via LLVM and its threading capabilities.
- Execution directly from the command line, including the design formula.
- Translate tximport and apeglm to Julia for start-to-finish analysis
- Minimize dependencies
- Hard coded gradients
- Simulation of transcriptome count datasets following an arbitrary regression formula for testing.
  - PowerLaw distribution implementation
  - Returns the count matrix, sampled coefficients, and other parameters
  - Essentially simulates the generative process that DESeq2 estimates with its model
- Native NegativeBinomial2 implementation
  - NB distribution formulated in terms of mean and dispersion
  - Sampling done via a Gamma-Poisson mixture
- Normalization and scaling factors
  - Effective length normalization and sequencing depth normalization are implemented
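
The simulation and NegativeBinomial2 items above can be sketched roughly as follows. This is a minimal illustration, not this repository's code: it assumes the Distributions.jl package, and the names `rand_nb2` and `simulate_counts` are hypothetical. Coefficients are drawn from a normal distribution purely for illustration (the PowerLaw sampling mentioned above is not reproduced), and means follow the DESeq2 model, where the mean for gene i in sample j is the size factor s_j times 2 raised to the linear predictor.

```julia
using Distributions, Random, Statistics

# Hypothetical helper: one NB draw with mean μ and dispersion α,
# parameterised so that Var = μ + α * μ^2.
function rand_nb2(rng::AbstractRNG, μ::Real, α::Real)
    # A Gamma with shape 1/α and scale α*μ has mean μ and variance α*μ^2;
    # a Poisson draw with that rate is then NB(μ, α) marginally.
    λ = rand(rng, Gamma(1 / α, α * μ))
    return rand(rng, Poisson(λ))
end

# Hypothetical helper: simulate a genes × samples count matrix for a
# design matrix X (samples × coefficients) and per-sample size factors.
function simulate_counts(rng::AbstractRNG, n_genes::Integer, X::AbstractMatrix,
                         size_factors::AbstractVector, α::Real)
    n_samples, n_coef = size(X)
    β = randn(rng, n_genes, n_coef)        # log2-scale coefficients (assumed N(0, 1))
    β[:, 1] .+= 6                          # shift the intercept so baseline means sit near 2^6
    μ = (2 .^ (β * X')) .* size_factors'   # genes × samples matrix of means
    counts = [rand_nb2(rng, μ[i, j], α) for i in 1:n_genes, j in 1:n_samples]
    return counts, β, μ
end

rng = MersenneTwister(1)
X = [ones(6) repeat([0.0, 1.0], inner = 3)]   # intercept + two-group condition
counts, β, μ = simulate_counts(rng, 200, X, ones(6), 0.1)

# Empirical check of the parameterisation: moments should be close to
# mean 50 and variance 50 + 0.2 * 50^2 = 550.
draws = [rand_nb2(rng, 50.0, 0.2) for _ in 1:100_000]
println((mean(draws), var(draws)))
```

Returning `β` and `μ` alongside the counts makes it possible to compare fitted coefficients against the values that generated the data.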

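As a rough sketch of the normalization item, the block below computes sequencing depth size factors with the median-of-ratios method used by DESeq2 and combines them with effective lengths into gene-by-sample normalization factors. The function names are hypothetical, and the way lengths are folded in is an assumption modelled loosely on how DESeq2 handles average transcript lengths, not a description of this repository's code. Only the `Statistics` standard library is used.

```julia
using Statistics

# Median-of-ratios size factors for a genes × samples count matrix.
function size_factors(counts::AbstractMatrix{<:Real})
    logc = log.(counts)                        # -Inf wherever a count is zero
    log_geomean = vec(mean(logc, dims = 2))    # per-gene log geometric mean
    keep = isfinite.(log_geomean)              # skip genes with any zero count
    any(keep) || error("every gene has at least one zero count")
    return [exp(median(logc[keep, j] .- log_geomean[keep])) for j in axes(counts, 2)]
end

# Gene-by-sample normalization factors from counts and effective lengths
# (both genes × samples); each row is scaled to a geometric mean of 1.
function normalization_factors(counts::AbstractMatrix{<:Real},
                               eff_len::AbstractMatrix{<:Real})
    row_geomean(M) = exp.(mean(log.(M), dims = 2))
    len_scaled = eff_len ./ row_geomean(eff_len)   # length effect, centred per gene
    sf = size_factors(counts ./ len_scaled)        # depth after removing length effect
    nf = len_scaled .* sf'
    return nf ./ row_geomean(nf)
end

counts = [10 20; 100 200; 50 100; 0 5]   # sample 2 is sequenced twice as deep
sf = size_factors(counts)                # ≈ [0.707, 1.414]
nf = normalization_factors(counts, ones(4, 2))
```

With constant effective lengths the normalization factors reduce to the size factors repeated for every gene, which is a convenient sanity check.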