Fix native Jetson Orin (sm_87) build: respect explicit CUDA arch + propagate CuTe DSL --wrap/shim to static-lib consumers #118

Open
suharvest wants to merge 2 commits into NVIDIA:main from suharvest:pr/jetson-build-compat

Conversation

@suharvest

What does this PR do?

Type of change: Bug fix

Overview: Fixes two build-system issues that block a native Jetson Orin (sm_87, JetPack 6 / CUDA 12.6) build (closes #117):

  1. CMakeLists.txt — respect an explicit -DCMAKE_CUDA_ARCHITECTURES.
    The project default (80;86;89[;100a;120]) was applied whenever AARCH64_BUILD was undefined, overriding an explicit value on a native Orin build. Because enable_language(CUDA) (CMP0104 NEW, the default under our cmake_minimum_required(VERSION 3.20)) already initializes CMAKE_CUDA_ARCHITECTURES to a compiler default, a check made after project() cannot tell an explicit value from that default. The user's intent is therefore captured before project() runs, and the project default is applied only when the user did not pass one. Regular x86 builds are unchanged; explicit selections are preserved.

  2. cmake/CuteDsl.cmake — propagate the cudart shim and --wrap to static-lib consumers.
    cute_dsl_setup() attached the cutedsl cudart shim and the CUDA<12.8 -Wl,--wrap=_cudaLaunchKernelEx option with PRIVATE. A static archive performs no link step, so for STATIC_LIBRARY link targets neither reached the consuming executable, breaking the CuTe DSL kernels on JetPack 6 / CUDA 12.6. The shim is now propagated via PUBLIC and the --wrap option via INTERFACE for static-library targets; PRIVATE is retained for shared/other targets.
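
To illustrate item 1, here is a minimal sketch of the "capture before project()" pattern. It is not the PR's actual diff: the project name and the `_user_set_cuda_archs` helper variable are hypothetical, and the optional `100a;120` entries of the project default are left out for brevity.

```cmake
cmake_minimum_required(VERSION 3.20)

# Record the user's intent BEFORE project(). At this point the variable is
# only defined if it was supplied explicitly, e.g. -DCMAKE_CUDA_ARCHITECTURES=87.
# DEFINED (rather than a truthiness test) also preserves an explicit empty value.
if(DEFINED CMAKE_CUDA_ARCHITECTURES)
  set(_user_set_cuda_archs TRUE)   # hypothetical helper variable
else()
  set(_user_set_cuda_archs FALSE)
endif()

project(example LANGUAGES CXX CUDA)  # hypothetical project name

# After project(), CMP0104 NEW means CMAKE_CUDA_ARCHITECTURES is always set,
# so testing it here could not distinguish "user chose" from "compiler default".
if(NOT _user_set_cuda_archs AND NOT DEFINED AARCH64_BUILD)
  set(CMAKE_CUDA_ARCHITECTURES 80 86 89)  # project default quoted in the PR text
endif()

message(STATUS "CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")
```

Under these assumptions, configuring with `-DCMAKE_CUDA_ARCHITECTURES=87` should report `87`, while a configure without the flag should report the project default.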

Usage

Build-system fix; no API change. Native Jetson Orin build:

cmake -B build -DCMAKE_BUILD_TYPE=Release \
  -DEMBEDDED_TARGET=jetson-orin \
  -DCMAKE_CUDA_ARCHITECTURES=87 \
  -DTRT_PACKAGE_DIR=/path/to/TensorRT
cmake --build build -j

🚀 Pull Request Checklist

✅ Pre-commit Checks

  • cmake-format (v0.6.13) and codespell run clean on the changed files. (Full pre-commit suite not run locally; the change is limited to two CMake files.)

🧪 Tests

  • No automated test covers the cross/Jetson build configuration. Verified manually: a native Orin (sm_87) configure now honors -DCMAKE_CUDA_ARCHITECTURES=87, and the CuTe DSL static-lib consumer links under CUDA 12.6.

📄 Documentation

  • No documentation change required (jetson-orin is already a documented EMBEDDED_TARGET).

⚙️ Compatibility

  • Backward compatible — x86/datacenter default builds and the aarch64 cross toolchain are unaffected.

Additional Information

Closes #117.

The default CMAKE_CUDA_ARCHITECTURES (80;86;89[;100a;120]) was applied whenever
AARCH64_BUILD was undefined, clobbering an explicit architecture passed on the
command line. This broke native Jetson Orin (sm_87) builds configured with
-DCMAKE_CUDA_ARCHITECTURES=87.

enable_language(CUDA) (CMP0104 NEW, the default at our cmake_minimum 3.20)
initializes CMAKE_CUDA_ARCHITECTURES to a compiler default, so the variable is
already set after project() even when the user passed nothing. Record whether
the user supplied an explicit value before project() runs, and only apply the
project default when they did not. Regular x86 builds are unchanged; explicit
selections (including an empty value) are preserved.

Signed-off-by: suharvest <suharvest@gmail.com>

cute_dsl_setup() linked the cutedsl cudart shim and the CUDA<12.8
-Wl,--wrap=_cudaLaunchKernelEx option PRIVATEly. For STATIC_LIBRARY link
targets a static archive performs no link step, so neither reached the final
executable, breaking the CuTe DSL kernels on JetPack 6 / CUDA 12.6.

Propagate the shim via PUBLIC and the --wrap link option via INTERFACE for
static-library targets so the consuming executable inherits them; keep PRIVATE
for shared/other targets.

Signed-off-by: suharvest <suharvest@gmail.com>
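
The visibility switch described in this commit message could look roughly like the sketch below. This is an assumption-laden illustration, not the contents of cmake/CuteDsl.cmake: the function name `example_attach_cudart_shim`, its parameters, and the use of `CMAKE_CUDA_COMPILER_VERSION` for the CUDA version check are all hypothetical stand-ins.

```cmake
# Hypothetical helper; the real cute_dsl_setup() may be structured differently.
function(example_attach_cudart_shim target shim_lib)
  get_target_property(_type ${target} TYPE)

  if(_type STREQUAL "STATIC_LIBRARY")
    # A static archive is never linked on its own, so the shim and the
    # link option must be visible to whatever target finally links it.
    target_link_libraries(${target} PUBLIC ${shim_lib})
    set(_opt_scope INTERFACE)
  else()
    # Shared libraries and executables run their own link step.
    target_link_libraries(${target} PRIVATE ${shim_lib})
    set(_opt_scope PRIVATE)
  endif()

  # Per the PR text, the --wrap option is only needed for CUDA < 12.8.
  # Assumption: the compiler version is used as the CUDA version here.
  if(CMAKE_CUDA_COMPILER_VERSION VERSION_LESS 12.8)
    target_link_options(${target} ${_opt_scope}
                        "-Wl,--wrap=_cudaLaunchKernelEx")
  endif()
endfunction()
```

With a helper like this, an executable that links the static library through `target_link_libraries()` would inherit both the shim and the `--wrap` option without any change on the consumer side.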
@suharvest suharvest requested a review from a team June 28, 2026 23:45

rickwalder commented Jul 1, 2026

+1, this fixes the Jetson AGX Orin build for me (JetPack 6.2.1)

Development

Successfully merging this pull request may close these issues.

[Bug] Native Jetson Orin (sm_87) build: explicit -DCMAKE_CUDA_ARCHITECTURES ignored + CuTe DSL --wrap/cudart shim not propagated to static-lib consumers