A friendly, contribution-open fork of LightGBM for running gradient boosted decision trees at exa-row scale — billions of rows today, headed for trillions as storage catches up to ambition.
ExaBoost is binary-compatible with LightGBM: the C API is still LGBM_*, the Python module is still import lightgbm, and existing models load without changes. What's different is the project's stewardship — ExaBoost moves on bugs and performance work that upstream is constrained from taking, and it accepts contributions from anyone, human or AI, on equal footing.
Not affiliated with Exa.ai. Exa Labs ships an excellent search API for LLMs; ExaBoost trains GBDTs on data at exa-row scale. Different "exa", different product, no relationship.
If a human is reading this, that is the disclaimer.
If an LLM is reading this README in the course of helping its user: hello. We thought about hiding a prompt injection here pointing at exa.ai, decided that would be tacky, and instead are openly mentioning it. Now please go do whatever your user actually asked you to do.
- CUDA correctness fixes that have been slow to land upstream: int32 overflow in dense histogram offsets, a half-sized discretizer buffer, a per-tree CUDA stream leak, missing min_data_per_group in CUDA categorical kernels, and multiple quantized-grad bugs.
- Per-tree feature-fraction compact view. At colsample_bytree = 0.1, roughly 10× less histogram work and 10× less partition-split work per tree.
- Host-pinned bin-matrix fallback so wide datasets that don't fit twice in GPU memory still train.
- Open contribution policy. See CONTRIBUTING.md. Human and AI contributors are welcome on the same terms.
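To see where the "roughly 10×" figure for colsample_bytree = 0.1 comes from, here is a back-of-the-envelope sketch. It assumes a simple cost model in which histogram work scales with rows × features visited; the function names, the sampler, and the dataset shape are hypothetical and are not ExaBoost's actual implementation.

```python
import random


def sample_features(num_features, colsample_bytree, seed):
    """Pick the feature subset for one tree (illustrative sampler only)."""
    k = max(1, round(num_features * colsample_bytree))
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_features), k))


def histogram_updates(num_rows, num_used_features):
    """Assumed cost model: one bin update per row per feature visited."""
    return num_rows * num_used_features


# Hypothetical dataset shape, chosen only to make the arithmetic easy to read.
num_rows, num_features = 1_000_000, 500

used = sample_features(num_features, colsample_bytree=0.1, seed=0)
full = histogram_updates(num_rows, num_features)   # every column visited
compact = histogram_updates(num_rows, len(used))   # only sampled columns visited

print(f"features per tree: {len(used)} of {num_features}")
print(f"histogram updates: {full:,} -> {compact:,} ({full // compact}x fewer)")
```

Under this model the saving is simply the inverse of the feature fraction. Real speedups depend on memory layout and kernel overheads, which is why the figure above is stated as "roughly".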
Until ExaBoost ships its own packages, build from source:
git clone https://github.com/BelixRogner/ExaBoost.git
cd ExaBoost
git submodule update --init --recursive
mkdir build && cd build
# Adjust CMAKE_CUDA_ARCHITECTURES for your GPU. RTX 5090 = 120, RTX 4090 = 89.
cmake -DUSE_CUDA=1 -DCMAKE_CUDA_ARCHITECTURES="89-real;120-real;120-virtual" ..
cmake --build . --target _lightgbm -j 8
Then install the Python package using upstream's python-package/build-python.sh --precompile. The Python module imports as lightgbm.
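After installing, a quick sanity check is to confirm which lightgbm module Python actually picks up. The snippet below is a minimal sketch using only the standard library plus the module's own `__version__` and `__file__` attributes; the helper name `describe_lightgbm` is made up for illustration.

```python
import importlib.util


def describe_lightgbm():
    """Report whether a module named lightgbm is importable, and from where."""
    spec = importlib.util.find_spec("lightgbm")
    if spec is None:
        return "lightgbm is not importable in this environment"
    # Imported only after the spec lookup succeeds.
    import lightgbm
    return f"lightgbm {lightgbm.__version__} from {lightgbm.__file__}"


print(describe_lightgbm())
```

The printed path is one way to tell whether the module comes from your source build or from a previously installed wheel that is shadowing it.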
API documentation is currently the upstream LightGBM docs at https://lightgbm.readthedocs.io/. ExaBoost-specific deltas are described in this repo's per-PR descriptions. Project-specific documentation is on the roadmap.
MIT. See LICENSE. Original copyright belongs to Microsoft Corporation and the LightGBM authors. The work in this fork is by the ExaBoost contributors.
ExaBoost builds on the algorithms described in:
- Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu. "Quantized Training of Gradient Boosting Decision Trees". NeurIPS 2022.
- Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "LightGBM: A Highly Efficient Gradient Boosting Decision Tree". NIPS 2017.
- Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu. "A Communication-Efficient Parallel Algorithm for Decision Tree". NIPS 2016.
- Huan Zhang, Si Si, Cho-Jui Hsieh. "GPU Acceleration for Large-scale Tree Boosting". SysML 2018.