Conversation
ilan-gold
left a comment
There was a problem hiding this comment.
Not sure what the unfinished tasks' status is, but the PR as it stands looks good. I would say unit-tests are a must but that's up to David at the end of the day.
| if isinstance(genes_log_gmean, dask.array.core.Array): | ||
| genes_log_gmean = genes_log_gmean.compute() |
There was a problem hiding this comment.
because I had trouble with this downstream. It was easier to transform to a dense numpy array before. I will check again to see if I can delay this further.
| genes_log_gmean = genes_log_gmean.compute() | ||
|
|
||
| # specify which kind of regularization is performed | ||
| scale_param = np.exp(self.model_container.theta_scale[0]) # TODO check if this is always correct |
There was a problem hiding this comment.
why an exponentiation here? is this meant to be link-function specific?
There was a problem hiding this comment.
I'm not sure how to resolve this (thus the TODO). Indeed, I think this depends on the noise model. I will have to look at that again.
|
Hi, "implement edgeR approach" this is actually a very important use-case for batchGLM. We are planning to use batchGLM/diffxpy for a pure Python implementation of MILO and we promised to be able to 1:1 replace eventually. I would kindly ask you to also strongly consider implementing this. Having the edgeR and DEseq2 approaches being implemented here will also greatly boost the impact. I have no doubt about this. |
Most definitely. I had a look at the edgeR source code for already. It shouldn't be too complicated to transfer this to batchGLM. I will start implementing this tomorrow but cannot give an estimate for the time it'll take at this point in time. I think the main part would be to take over |
|
Amazing @picciama
Would it be too crazy to just compare for example the DE results of the edgeR reimplementation and the new Python version for a small dataset/simulation? I know that the edgeR model does a lot, but this might be the eventual goal? Thank you! |
This branch contains the dispersion-smoothing functionality:
Optional TODOs: