Add Triton installation for Windows CUDA, Linux ROCm/XPU #31
base: main
Conversation
Hi @iwr-redmond, since you mentioned ROCm support in Issue #5, could you help test if this installation logic works on your AMD setup? |
Thanks @godnight10061! You can also ask on the Discord. You may also want to provide a simple script file that testers can run to verify it.
I'll give it a try with Windows 11 + WSL2 (Ubuntu) soon.
My RTX 4070 is as useful for testing ROCm as the second buggy in a one-horse town. |
Hi @cmdr2, this PR is ready for your review. I have implemented the automatic Triton installation logic for Windows (CUDA), Linux (ROCm), and Linux (XPU). To make verification easier for the community, I also added a built-in self-test command in the latest commit: `python -m torchruntime test compile`. This will automatically verify whether Triton is correctly installed and functional with `torch.compile`. I have successfully smoke-tested this on Windows with an RTX 3060 Ti. Since I lack AMD and Intel hardware, I've also reached out on Discord to find testers for the ROCm and XPU paths.
Quick update: A community member just verified this on Linux Mint 22.1 (Python 3.9). The test compile command passed successfully on their system, confirming that the ROCm-based Triton installation logic works as expected on Debian-based Linux distributions. |
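For testers who prefer to script things, the self-test command mentioned above can also be driven from Python. This is a hedged sketch: it assumes a `torchruntime` with this PR's changes is installed, and that a zero exit code means the compile test passed; a non-zero code may simply mean the package is absent.

```python
# Invoke the PR's `test compile` self-test as a subprocess and report the
# result. Assumes torchruntime (with this PR's changes) is installed.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-m", "torchruntime", "test", "compile"],
    capture_output=True,
    text=True,
)
print("exit code:", result.returncode)
if result.returncode == 0:
    print("Triton + torch.compile look functional")
else:
    # Likely torchruntime is missing or Triton is not set up correctly.
    print(result.stderr.strip() or result.stdout.strip())
```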
Summary
- Windows CUDA (`cu*` / `nightly/cu*`): installs `triton-windows`
- Linux ROCm: installs `pytorch-triton-rocm` (from https://download.pytorch.org/whl)
- Linux XPU: installs `pytorch-triton-xpu` (from https://download.pytorch.org/whl)
- Also installs `packaging` (required by `torchruntime.platform_detection`)

Refs: #5
Why
`torch.compile` (and many third-party kernels) requires Triton. On some platforms Torch bundles it (e.g. Linux CUDA), but on others users end up without Triton even after installing a GPU build of Torch.

Implementation
`torchruntime/installer.py` appends an extra pip install command for the platform-specific Triton package.

Testing
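The selection logic can be sketched roughly as follows. The function and variable names here are illustrative, not the actual `torchruntime` internals; only the package names and index URL come from the PR summary.

```python
# Illustrative sketch of the platform -> Triton package mapping and the
# extra pip command the installer could append. Names are hypothetical.
import sys

PYTORCH_INDEX = "https://download.pytorch.org/whl"

def pick_triton_package(os_name, torch_platform):
    """Return (package, index_url) for the extra Triton install, or None."""
    if os_name == "windows" and "cu" in torch_platform:
        return ("triton-windows", None)            # Windows CUDA
    if os_name == "linux" and torch_platform.startswith("rocm"):
        return ("pytorch-triton-rocm", PYTORCH_INDEX)
    if os_name == "linux" and torch_platform == "xpu":
        return ("pytorch-triton-xpu", PYTORCH_INDEX)
    return None  # e.g. Linux CUDA: torch already bundles Triton

def triton_pip_cmd(package, index_url=None):
    """Build the pip invocation for the chosen Triton package."""
    cmd = [sys.executable, "-m", "pip", "install", package]
    if index_url:
        cmd += ["--index-url", index_url]
    return cmd

pkg = pick_triton_package("linux", "rocm6.2")
print(triton_pip_cmd(*pkg))
```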
- `python -m pytest -q` passes.
- Windows CUDA manual test: installed `torch`/`torchvision`/`torchaudio` from https://download.pytorch.org/whl/cu128; `import triton` fails; `python -m torchruntime install` -> installs `triton-windows`; `torch.compile` CUDA smoke test -> OK.

Request For Testing (hardware help wanted)
If you have one of these setups, please try:
`python -m torchruntime install`, then verify `import triton` and run a small `torch.compile` test.

Also welcome: Windows CUDA users on different GPUs/Python versions.
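A tiny helper testers could use for the `import triton` half of the check above. This script is an assumed convenience, not part of the PR; it only confirms that Triton is importable, so a real `torch.compile` run is still needed afterwards.

```python
# Quick post-install check: is Triton importable at all?
# Covers only the `import triton` step of the testing instructions.
import importlib.util

def triton_available():
    """True if a `triton` package is importable in this environment."""
    return importlib.util.find_spec("triton") is not None

if triton_available():
    import triton
    print("Triton is importable, version:", getattr(triton, "__version__", "unknown"))
else:
    print("Triton not found; try `python -m torchruntime install` first")
```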