Fix native Jetson Orin (sm_87) build: respect explicit CUDA arch + propagate CuTe DSL --wrap/shim to static-lib consumers #118
Open
suharvest wants to merge 2 commits into
The default CMAKE_CUDA_ARCHITECTURES (80;86;89[;100a;120]) was applied whenever AARCH64_BUILD was undefined, clobbering an explicit architecture passed on the command line. This broke native Jetson Orin (sm_87) builds configured with -DCMAKE_CUDA_ARCHITECTURES=87. enable_language(CUDA) (CMP0104 NEW, the default at our cmake_minimum 3.20) initializes CMAKE_CUDA_ARCHITECTURES to a compiler default, so the variable is already set after project() even when the user passed nothing. Record whether the user supplied an explicit value before project() runs, and only apply the project default when they did not. Regular x86 builds are unchanged; explicit selections (including an empty value) are preserved. Signed-off-by: suharvest <suharvest@gmail.com>
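The capture-before-project() idea in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the variable name EXAMPLE_USER_SET_CUDA_ARCHS and the project name are hypothetical, the optional 100a/120 entries of the default list are left out, and writing the default back to the cache with FORCE is an assumption made here so that re-configuring stays stable.

```cmake
cmake_minimum_required(VERSION 3.20)

# Must run before project(): once the CUDA language is enabled, CMake may
# already have filled CMAKE_CUDA_ARCHITECTURES with a compiler default,
# which would be indistinguishable from a user-supplied value.
if(DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(EXAMPLE_USER_SET_CUDA_ARCHS TRUE)  # hypothetical variable name
else()
  set(EXAMPLE_USER_SET_CUDA_ARCHS FALSE)
endif()

project(example LANGUAGES CXX CUDA)  # hypothetical project name

# Apply the project default only when nothing was chosen beforehand.
# The AARCH64_BUILD guard mirrors the condition described in the commit.
if(NOT EXAMPLE_USER_SET_CUDA_ARCHS AND NOT DEFINED AARCH64_BUILD)
  # Assumption: FORCE stores the default in the cache, so a later
  # re-configure sees this value rather than the compiler default.
  set(CMAKE_CUDA_ARCHITECTURES "80;86;89"
      CACHE STRING "CUDA architectures" FORCE)
endif()

message(STATUS "CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")
```

With a layout like this, configuring with -DCMAKE_CUDA_ARCHITECTURES=87 should leave the value at 87, while configuring without the option should fall back to the project default.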
cute_dsl_setup() linked the cutedsl cudart shim and the CUDA<12.8 -Wl,--wrap=_cudaLaunchKernelEx option PRIVATEly. For STATIC_LIBRARY link targets a static archive performs no link step, so neither reached the final executable, breaking the CuTe DSL kernels on JetPack 6 / CUDA 12.6. Propagate the shim via PUBLIC and the --wrap link option via INTERFACE for static-library targets so the consuming executable inherits them; keep PRIVATE for shared/other targets. Signed-off-by: suharvest <suharvest@gmail.com>
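The scope selection described in this commit message might look something like the sketch below. The function name example_attach_cutedsl_shim and its arguments are hypothetical stand-ins for whatever cute_dsl_setup() does internally, and the sketch assumes find_package(CUDAToolkit) has already run so that CUDAToolkit_VERSION is available.

```cmake
# Hypothetical helper, for illustration only; not the PR's code.
# Assumes find_package(CUDAToolkit REQUIRED) was called earlier.
function(example_attach_cutedsl_shim target shim)
  get_target_property(_type ${target} TYPE)

  if(_type STREQUAL "STATIC_LIBRARY")
    # A static archive is never linked itself, so hand both the shim and
    # the linker option to whatever target consumes the archive.
    set(_lib_scope PUBLIC)
    set(_opt_scope INTERFACE)
  else()
    # Shared libraries and executables run their own link step.
    set(_lib_scope PRIVATE)
    set(_opt_scope PRIVATE)
  endif()

  target_link_libraries(${target} ${_lib_scope} ${shim})

  # The --wrap option is only needed for CUDA toolkits older than 12.8.
  if(CUDAToolkit_VERSION VERSION_LESS 12.8)
    target_link_options(${target} ${_opt_scope}
                        "-Wl,--wrap=_cudaLaunchKernelEx")
  endif()
endfunction()
```

A call such as example_attach_cutedsl_shim(my_static_lib my_shim), with both target names being placeholders, would then let an executable that links my_static_lib inherit the shim and the --wrap option.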
+1, this fixes the Jetson AGX Orin build for me (JetPack 6.2.1).
What does this PR do?
Type of change: Bug fix
Overview: Fixes two build-system issues that block a native Jetson Orin (sm_87, JetPack 6 / CUDA 12.6) build (closes #117):
CMakeLists.txt: respect an explicit -DCMAKE_CUDA_ARCHITECTURES. The project default (80;86;89[;100a;120]) was applied whenever AARCH64_BUILD was undefined, overriding an explicit value on a native Orin build. Because enable_language(CUDA) (CMP0104 NEW, the default at our cmake_minimum 3.20) already initializes CMAKE_CUDA_ARCHITECTURES to a compiler default, the user's intent is now captured before project() runs, and the project default is applied only when the user did not pass one. Regular x86 builds are unchanged; explicit selections are preserved.
cmake/CuteDsl.cmake: propagate the cudart shim and --wrap to static-lib consumers. cute_dsl_setup() attached the cutedsl cudart shim and the CUDA<12.8 -Wl,--wrap=_cudaLaunchKernelEx option with PRIVATE. A static archive performs no link step, so for STATIC_LIBRARY link targets neither reached the consuming executable, breaking the CuTe DSL kernels on JetPack 6 / CUDA 12.6. The shim is now propagated via PUBLIC and the --wrap option via INTERFACE for static-library targets; PRIVATE is retained for shared/other targets.
Usage
Build-system fix; no API change. Native Jetson Orin build:
🚀 Pull Request Checklist
✅ Pre-commit Checks
cmake-format (v0.6.13) and codespell run clean on the changed files. (Full pre-commit suite not run locally; the change is limited to two CMake files.)
🧪 Tests
-DCMAKE_CUDA_ARCHITECTURES=87, and the CuTe DSL static-lib consumer links under CUDA 12.6.
📄 Documentation
jetson-orin is already a documented EMBEDDED_TARGET).
⚙️ Compatibility
Additional Information
Closes #117.