Description
AirPLS (Adaptive Iteratively Reweighted Penalized Least Squares) and ArPLS (Asymmetrically Reweighted Penalized Least Squares) are powerful algorithms for removing complex non-linear baselines from spectral signals. However, their computational cost can be significant, especially when processing large numbers of spectra. Currently, we use the csc_matrix representation from scipy.sparse to optimize performance, but further improvements are needed.
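For orientation, the kind of iteration being optimized can be sketched as follows. This is a minimal ArPLS-style loop based on the weighting scheme in the ArPLS paper listed under Resources, not the project's actual implementation. The function name `arpls_sketch` and the parameters `lam`, `ratio`, and `max_iter` are assumptions chosen for illustration.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve
from scipy.special import expit


def arpls_sketch(y, lam=1e5, ratio=1e-6, max_iter=50):
    """Illustrative ArPLS-style baseline estimate (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, stored as a sparse CSC matrix.
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    H = (lam * (D.T @ D)).tocsc()  # smoothness penalty
    w = np.ones(n)
    z = y.copy()
    for _ in range(max_iter):
        W = sparse.diags(w, 0, format="csc")
        # One sparse linear solve per iteration: this is the costly step.
        z = spsolve((W + H).tocsc(), w * y)
        d = y - z
        d_neg = d[d < 0]
        if d_neg.size == 0 or d_neg.std() == 0:
            break  # nothing left to reweight
        m, s = d_neg.mean(), d_neg.std()
        # Logistic weights: points far above the baseline get weight near 0.
        w_new = expit(-2.0 * (d - (2.0 * s - m)) / s)
        if np.linalg.norm(w - w_new) / np.linalg.norm(w) < ratio:
            break
        w = w_new
    return z
```

Each pass rebuilds the weight matrix and solves one sparse system, so the cost grows with both the number of iterations and the number of spectra.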
Improvement Attempts
To improve performance, I tried just-in-time (JIT) compilation of some key functions using numba. However, numba does not support the csc_matrix type, so the code cannot be JIT compiled as is. I looked for a numba-compatible representation of sparse matrices but could not find one, so I wrote my own, together with some functions for basic algebra operations on it (code in a Gist). Unfortunately, this did not improve performance over the current implementation.
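One way to picture the workaround: a csc_matrix cannot be passed into a numba-compiled function, but the three NumPy arrays that back it (`data`, `indices`, `indptr`) can. The sketch below is not the Gist code. It unpacks a CSC matrix in plain Python and hands the arrays to a jitted matrix-vector product. The names `csc_matvec` and `matvec` are hypothetical, and the fallback decorator is only there so the example still runs where numba is not installed. No speed-up is implied.

```python
import numpy as np
from scipy import sparse

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python without numba.
    def njit(func):
        return func


@njit
def csc_matvec(data, indices, indptr, x, n_rows):
    """Compute y = A @ x from the raw CSC arrays of A."""
    y = np.zeros(n_rows)
    for col in range(indptr.size - 1):
        # Entries of column `col` live in data[indptr[col]:indptr[col + 1]].
        for k in range(indptr[col], indptr[col + 1]):
            y[indices[k]] += data[k] * x[col]
    return y


def matvec(A, x):
    """Unpack a SciPy sparse matrix outside the jitted code, then call the kernel."""
    A = sparse.csc_matrix(A)
    return csc_matvec(
        A.data.astype(np.float64),
        A.indices,
        A.indptr,
        np.asarray(x, dtype=np.float64),
        A.shape[0],
    )
```

The same unpacking pattern would apply to other operations, but a sparse linear solve is much harder to write by hand than a product, which may be part of why a custom representation does not easily beat the existing scipy.sparse routines.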
Hacktoberfest Challenge
We invite open source developers to contribute to our project during Hacktoberfest. The goal is to improve the performance of both algorithms.
Here are some ideas to work on:
- Find a more efficient way to JIT compile the code using tools like numba.
- Investigate parallel or distributed computing techniques to speed up the processing of multiple spectra.
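As a starting point for the second idea, the spectra are independent of each other, so the per-spectrum baseline routine can be mapped over a worker pool. The sketch below uses Python's standard `concurrent.futures`. The names `smooth_baseline` and `correct_many` are hypothetical, and `smooth_baseline` is only a stand-in (a single unweighted Whittaker smoothing step) for a real AirPLS or ArPLS call. Whether threads give any speed-up depends on how much time is spent in code that releases the GIL, so this needs benchmarking.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def smooth_baseline(y, lam=1e5):
    """Stand-in for a per-spectrum AirPLS/ArPLS call (hypothetical helper)."""
    n = y.size
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    A = sparse.identity(n, format="csc") + lam * (D.T @ D)
    return spsolve(A.tocsc(), y)


def correct_many(spectra, baseline_func=smooth_baseline, max_workers=None,
                 executor_cls=ThreadPoolExecutor):
    """Subtract an estimated baseline from each row of `spectra` using a pool."""
    spectra = np.asarray(spectra, dtype=float)
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map preserves input order, so row i of the result matches row i.
        baselines = list(pool.map(baseline_func, spectra))
    return spectra - np.vstack(baselines)
```

Passing `executor_cls=ProcessPoolExecutor` is an alternative when the baseline function is defined at module level and can be pickled. It avoids the GIL at the cost of copying each spectrum to a worker process.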
How to Contribute
Here are the contributing guidelines.
Contact
We can have the conversation in the Issue or the Discussion.
Resources
Here are some relevant resources and references for understanding the theory and implementation of the AirPLS and ArPLS algorithms:
- Paper on AirPLS: Z.-M. Zhang, S. Chen, and Y.-Z. Liang, Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst 135 (5), 1138-1146 (2010).
- Paper on ArPLS: S.-J. Baek, A. Park, Y.-J. Ahn, and J. Choo, Baseline correction using asymmetrically reweighted penalized least squares smoothing. Analyst 140 (1), 250-257 (2015).