
Backport tile fixes 34 #9542

Merged
wmaxey merged 3 commits into NVIDIA:branch/3.4.x from miscco:backport_tile_fixes_34
Jun 23, 2026

Conversation

@miscco miscco commented Jun 22, 2026

Contributor

This backports #9487 and #9488

miscco added 2 commits June 22, 2026 12:47
* [Tile] Mark alignment helpers as unsupported in tile mode

Currently, tile does not support `__builtin_assume_aligned`.

Previously we would just disable the whole code path, but with the added support for `__builtin_is_constant_evaluated` it became active again.

Rather than disabling it for all of CCCL in tile mode, we mark the functions that use the builtin as `_CCCL_HOST_DEVICE`.

* [Tile] Fix various tests

* [Tile] Do not compile out device-only function

It is otherwise not visible.
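The first commit message describes a helper that uses `__builtin_assume_aligned` and a constant-evaluation check. The following is a minimal host-side sketch of that general pattern, not the CCCL implementation: the name `sketch_assume_aligned` is hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler that provides both builtins.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only; not cuda::std::assume_aligned.
template <std::size_t Align, class T>
constexpr T* sketch_assume_aligned(T* ptr) noexcept
{
  static_assert(Align != 0 && (Align & (Align - 1)) == 0, "Align must be a power of two");
  if (__builtin_is_constant_evaluated())
  {
    // During constant evaluation the alignment hint is skipped and the
    // pointer is returned unchanged.
    return ptr;
  }
  // Runtime path: hand the alignment hint to the optimizer.
  return static_cast<T*>(__builtin_assume_aligned(ptr, Align));
}

// Compile-time use takes the first branch.
constexpr int sketch_value = 7;
static_assert(*sketch_assume_aligned<alignof(int)>(&sketch_value) == 7,
              "constant evaluation returns the pointer unchanged");
```

Once the constant-evaluation check is available, the runtime branch is compiled again, which matches the commit message's remark that the code path "became active again".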
@miscco miscco requested a review from a team as a code owner June 22, 2026 10:54
@miscco miscco requested a review from wmaxey June 22, 2026 10:54
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Jun 22, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Jun 22, 2026
coderabbitai Bot commented Jun 22, 2026

Contributor

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Memory alignment utilities and related helpers (e.g., align, assume-aligned, pointer rebind) are now available for both host and device/tile execution.
  • Chores

    • Improved tile compilation detection by adding stricter compiler/version gating.
    • Expanded tile-mode test coverage and adjusted device/tile annotations accordingly.
    • Updated tests to document known tile codegen issues and expected failures (nvbug6327166).

Walkthrough

_CCCL_TILE_COMPILATION() is redefined to require both __CUDACC_TILE__ and NVCC version >13.3. Memory and alignment functions (align_down, align_up, ptr_rebind, aligned_accessor, align, assume_aligned, __runtime_assume_aligned) are re-annotated with _CCCL_HOST_DEVICE_API instead of _CCCL_API. Test infrastructure macros are updated to separate tile and device qualifiers, device-only utility tests are broadened with tile execution paths and new tile-only test files, mdspan device tests switch to TEST_TILE_DEVICE_FUNC, and tile codegen failures are annotated with nvbug6327166.
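The version gate described in the walkthrough could look roughly like the sketch below. The `SKETCH_*` macro names are hypothetical, and comparing against `__CUDACC_VER_MAJOR__` and `__CUDACC_VER_MINOR__` is an assumption about how "NVCC >13.3" is checked; the actual definition in `compiler.h` may rely on CCCL's own version helpers.

```cpp
// Hypothetical macros for illustration; not the definitions in compiler.h.
#if defined(__CUDACC_TILE__) && defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
// Assumed reading of ">13.3": strictly newer than major 13, minor 3.
#  define SKETCH_NVCC_NEWER_THAN_13_3 \
    ((__CUDACC_VER_MAJOR__ > 13) || (__CUDACC_VER_MAJOR__ == 13 && __CUDACC_VER_MINOR__ > 3))
#else
#  define SKETCH_NVCC_NEWER_THAN_13_3 0
#endif

// Tile compilation is reported only when both conditions hold;
// otherwise the macro evaluates to 0.
#if defined(__CUDACC_TILE__) && SKETCH_NVCC_NEWER_THAN_13_3
#  define SKETCH_TILE_COMPILATION() 1
#else
#  define SKETCH_TILE_COMPILATION() 0
#endif

constexpr bool sketch_tile_enabled = SKETCH_TILE_COMPILATION() != 0;
```

Because the macro always expands to 0 or 1, it can be combined in `#if` expressions such as `&& !SKETCH_TILE_COMPILATION()`, similar to the guards the summary mentions for the copy tests.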

Changes

Tile Compilation Support for Memory, Alignment, and Test Infrastructure

| Layer / File(s) | Summary |
|---|---|
| **Tile compilation macros and `_CCCL_TILE_API` definition**<br>`libcudacxx/include/cuda/std/__cccl/compiler.h`, `libcudacxx/include/cuda/std/__cccl/visibility.h` | Removes unconditional `_CCCL_TILE_COMPILATION()` and redefines it to require `__CUDACC_TILE__` and NVCC >13.3; otherwise evaluates to 0. Adds `_CCCL_TILE_API` macro for both NVHPC and non-NVHPC compiler branches with visibility/instantiation control. |
| **Memory and alignment functions re-annotated with `_CCCL_HOST_DEVICE_API`**<br>`libcudacxx/include/cuda/__memory/align_down.h`, `libcudacxx/include/cuda/__memory/align_up.h`, `libcudacxx/include/cuda/__memory/ptr_rebind.h`, `libcudacxx/include/cuda/std/__mdspan/aligned_accessor.h`, `libcudacxx/include/cuda/std/__memory/align.h`, `libcudacxx/include/cuda/std/__memory/assume_aligned.h`, `libcudacxx/include/cuda/std/__memory/runtime_assume_aligned.h` | Switches `_CCCL_API` to `_CCCL_HOST_DEVICE_API` on `align_down`, `align_up`, all four cv-qualified `ptr_rebind` overloads, `aligned_accessor::access`/`offset`, `cuda::std::align`, `assume_aligned`, and `__runtime_assume_aligned`; adds `[[maybe_unused]]` to `__runtime_assume_aligned` parameter. |
| **Test macros and types separated for tile/device distinction**<br>`libcudacxx/test/support/test_macros.h`, `libcudacxx/test/support/host_device_types.h` | Updates `TEST_FUNC` to `_CCCL_HOST_DEVICE _CCCL_TILE`, changes `TEST_DEVICE_FUNC` to `_CCCL_DEVICE`, adds `TEST_TILE_FUNC` and `TEST_TILE_DEVICE_FUNC` macros. Switches `device_only_type` member functions from `TEST_DEVICE_FUNC` to explicit `__device__` annotations; adds `tile_only_type` with `__tile__` annotations under tile compilation guard. |
| **Utility tests broadened for tile execution and new tile-only test files**<br>`libcudacxx/test/libcudacxx/cuda/utilities/*/device_only_types.pass.cpp`, `libcudacxx/test/libcudacxx/cuda/utilities/*/tile_only_types.pass.cpp`, `libcudacxx/test/libcudacxx/cuda/functional/proclaim_return_type.pass.cpp` | Wraps device-only `test()` in `_CCCL_TILE_COMPILATION() \|\| _CCCL_DEVICE_COMPILATION()` guards and adds `__global__` `test_kernel()` entry points; adds new tile-only test files for `expected<void, T>`, `expected<T, T>`, `optional<T>`, `tuple<T>`, `unexpected<T>`, `pair<T, T>`, `variant<T>`. Updates functional test macros to `TEST_TILE_DEVICE_FUNC`/`TEST_TILE_FUNC()`. |
| **mdspan device tests updated with `TEST_TILE_DEVICE_FUNC` and constraint refactoring**<br>`libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/device_mdspan/index_operator.pass.cpp`, `libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/*.pass.cpp` | Updates all `test_mdspan_types`, `mixin_*`, `test()`, and `test_evil()` helpers from `TEST_DEVICE_FUNC` to `TEST_TILE_DEVICE_FUNC`. Refactors multi-argument `operator[]` constraint from `enable_if_t` type matching to `requires` expression validation for better SFINAE clarity. |
| **Algorithm tests gated to exclude tile compilation**<br>`libcudacxx/test/libcudacxx/std/algorithms/alg.modifying/alg.copy/*.pass.cpp` | Adds `&& !_CCCL_TILE_COMPILATION()` guards to device-only copy test instantiations to prevent unsupported tile codegen paths. |
| **Function object tests extended with tile-specific callable coverage**<br>`libcudacxx/test/libcudacxx/std/utilities/function.objects/*/*.pass.cpp` | Updates comment documentation and adds conditional `_CCCL_TILE_COMPILATION()` test sections across arithmetic, bitwise, comparison, logical, and negation operator tests; each section defines a `with_tile_op` functor and `__tile_global__` kernel to validate tile-qualified callable behavior. |
| **Memory/alignment and miscellaneous tests marked unsupported or XFAIL for tile codegen ICE**<br>`libcudacxx/test/libcudacxx/cuda/memory/*.pass.cpp`, `libcudacxx/test/libcudacxx/std/utilities/memory/ptr.align/*.pass.cpp`, `libcudacxx/test/libcudacxx/std/containers/views/mdspan/mdspan.aligned_accessor/aligned_accessor.pass.cpp`, `libcudacxx/test/libcudacxx/std/atomics/.../16b_integral_ref.pass.cpp`, `libcudacxx/test/libcudacxx/libcxx/*`, `libcudacxx/test/libcudacxx/std/utilities/time/...max.pass.cpp` | Replaces generic "asm unsupported" annotations with specific nvbug6327166 internal compiler error notes across memory alignment tests; adds `UNSUPPORTED: enable-tile` directives; tightens device-only guards with `&& !_CCCL_TILE_COMPILATION()` where appropriate; switches `assume_aligned.runfail.cpp` to public `<cuda/std/memory>` header; removes obsolete `TEST_DEVICE_FUNC` compile guard in `max.pass.cpp`. |
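For readers unfamiliar with the alignment helpers named in the summary, the sketch below shows what rounding a pointer down or up to a power-of-two boundary typically involves. It is a host-only illustration with hypothetical names (`sketch_align_down`, `sketch_align_up`) and is not the CCCL implementation of `align_down` or `align_up`; it also assumes the alignment argument is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical host-only helpers for illustration; not cuda::align_down / cuda::align_up.

// Rounds ptr down to the nearest multiple of alignment (a power of two).
template <class T>
T* sketch_align_down(T* ptr, std::size_t alignment) noexcept
{
  const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
  const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
  return reinterpret_cast<T*>(addr & ~mask);
}

// Rounds ptr up to the nearest multiple of alignment (a power of two).
template <class T>
T* sketch_align_up(T* ptr, std::size_t alignment) noexcept
{
  const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
  const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
  return reinterpret_cast<T*>((addr + mask) & ~mask);
}
```

A library version would usually also pass the result through an alignment hint such as `assume_aligned`, which may explain why these helpers appear in the same set of re-annotated files; that connection is an inference from the summary, not something the PR states.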

Possibly related PRs

  • NVIDIA/cccl#9487: Directly overlaps — changes the same alignment/memory APIs (align_down, align_up, ptr_rebind, assume_aligned, align) from _CCCL_API to _CCCL_HOST_DEVICE_API.
  • NVIDIA/cccl#9488: Changes _CCCL_TILE_COMPILATION() in the same compiler.h file to gate tile mode on NVCC version >13.3.

Suggested labels

backport branch/3.4.x

Suggested reviewers

  • fbusato
  • wmaxey

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
libcudacxx/test/libcudacxx/std/atomics/atomics.types.generic/integral/16b_integral_ref.pass.cpp (1)

13-15: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

suggestion: Use UNSUPPORTED: enable-tile and reference nvbug6327166 for consistency with the other tile-related test annotations in this layer (aligned_accessor.pass.cpp, assume_aligned.pass.cpp, align.pass.cpp). Those files document the specific ICE: "call to unknown tile builtin function!"

If this test hits a different error ("asm statement is unsupported"), clarify whether it's the same underlying bug or a distinct issue.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f98f4d3a-5a71-492d-a59b-1396923fc269

📥 Commits

Reviewing files that changed from the base of the PR and between 70434dd and ebf0233.

📒 Files selected for processing (29)
  • libcudacxx/include/cuda/__memory/align_down.h
  • libcudacxx/include/cuda/__memory/align_up.h
  • libcudacxx/include/cuda/__memory/ptr_rebind.h
  • libcudacxx/include/cuda/std/__cccl/compiler.h
  • libcudacxx/include/cuda/std/__mdspan/aligned_accessor.h
  • libcudacxx/include/cuda/std/__memory/align.h
  • libcudacxx/include/cuda/std/__memory/assume_aligned.h
  • libcudacxx/include/cuda/std/__memory/runtime_assume_aligned.h
  • libcudacxx/test/libcudacxx/cuda/functional/proclaim_return_type.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/memory/align_down.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/memory/align_up.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/memory/ptr_rebind.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/expected.void/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/optional/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/tuple/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/unexpected/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/utility/pair/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/variant/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/libcxx/macros/extended_data_types.fp8.fail.cpp
  • libcudacxx/test/libcudacxx/libcxx/numerics/floating.point/device_fp128_functions.pass.cpp
  • libcudacxx/test/libcudacxx/std/algorithms/alg.modifying/alg.copy/copy_backward.pass.cpp
  • libcudacxx/test/libcudacxx/std/algorithms/alg.modifying/alg.copy/copy_n.pass.cpp
  • libcudacxx/test/libcudacxx/std/algorithms/alg.modifying/alg.copy/copy_rand.pass.cpp
  • libcudacxx/test/libcudacxx/std/atomics/atomics.types.generic/integral/16b_integral_ref.pass.cpp
  • libcudacxx/test/libcudacxx/std/containers/views/mdspan/mdspan.aligned_accessor/aligned_accessor.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/memory/ptr.align/align.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/memory/ptr.align/assume_aligned.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/memory/ptr.align/assume_aligned.runfail.cpp

// UNSUPPORTED: enable-tile
// nvbug6327166: error: Internal Compiler Error (tile codegen): "call to unknown tile builtin function!"

#include <cuda/std/memory>

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether the internal header is still functional in non-tile builds
# and whether <cuda/std/memory> transitively includes assume_aligned

rg -n "include.*assume_aligned" libcudacxx/include/cuda/std/memory

# Check if the internal header is still present and functional
fd -t f "assume_aligned.h" libcudacxx/include/cuda/std/__memory/

Repository: NVIDIA/cccl

Length of output: 220


important: The change from <cuda/std/__memory/assume_aligned.h> to <cuda/std/memory> violates the guideline requiring the most precise header available. The internal header still exists and is transitively included by the umbrella header, so using the precise internal header is preferred. Either revert to the internal header or provide technical justification for the broader header (e.g., if testing the public API stability is the intent).

Source: Coding guidelines

…VIDIA#9313)

We were accidentally always testing `__tile__ __device__` functions, which avoided some issues that have been fixed in the ToT (top-of-tree) compiler.

We need to ensure that we properly test `__device__`-only functions in a `__device__ __tile__` feature.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
libcudacxx/test/libcudacxx/std/utilities/function.objects/func.invoke/invoke.pass.cpp (1)

401-404: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

suggestion: Align callable return values with the asserted expectation. Line 403 and Line 419 return bool{} (false), while Line 410 and Line 426 assert a truthy cuda::std::invoke(...) result. Returning true here avoids a latent runtime-failure path if these kernels are executed.

Also applies to: 410-410, 417-420, 426-426
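The point about `bool{}` can be checked in isolation. The sketch below uses hypothetical stand-in functors, not the ones in `invoke.pass.cpp`, and plain host C++ rather than tile-qualified callables.

```cpp
// Hypothetical stand-ins for the callables discussed in the review comment.
struct returns_default_bool
{
  // bool{} is value-initialized, which yields false.
  constexpr bool operator()() const noexcept { return bool{}; }
};

struct returns_true
{
  constexpr bool operator()() const noexcept { return true; }
};

static_assert(bool{} == false, "a value-initialized bool is false");
static_assert(!returns_default_bool{}(), "asserting this result as truthy would fail");
static_assert(returns_true{}(), "returning true satisfies a truthy assertion");
```

A runtime `assert` on the result of the first functor would therefore fail if the kernel were actually executed, which is the latent failure path the comment warns about.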


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: df7fa037-d24d-49dc-939d-7ae53b22f68e

📥 Commits

Reviewing files that changed from the base of the PR and between ebf0233 and 990e7c8.

📒 Files selected for processing (55)
  • libcudacxx/include/cuda/std/__cccl/visibility.h
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/device_mdspan/index_operator.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/assign.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/conversion.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.copy.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.default.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_array.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_extents.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_integers.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_map.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_map_acc.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.dh_span.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/ctor.move.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/index_operator.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/move.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/swap.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/expected.void/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/expected.void/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/expected/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/optional/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/optional/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/tuple/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/tuple/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/unexpected/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/unexpected/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/utility/pair/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/utility/pair/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/variant/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/variant/tile_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/divides.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/minus.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/modulus.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/multiplies.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/negate.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/arithmetic.operations/plus.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/bitwise.operations/bit_and.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/bitwise.operations/bit_not.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/bitwise.operations/bit_or.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/bitwise.operations/bit_xor.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/equal_to.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/greater.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/greater_equal.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/less.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/less_equal.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/comparisons/not_equal_to.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/func.invoke/invoke.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/logical.operations/logical_and.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/logical.operations/logical_not.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/logical.operations/logical_or.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/negators/binary_negate.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/function.objects/negators/unary_negate.pass.cpp
  • libcudacxx/test/libcudacxx/std/utilities/time/time.traits/time.traits.duration_values/max.pass.cpp
  • libcudacxx/test/support/host_device_types.h
  • libcudacxx/test/support/test_macros.h
💤 Files with no reviewable changes (1)
  • libcudacxx/test/libcudacxx/std/utilities/time/time.traits/time.traits.duration_values/max.pass.cpp
✅ Files skipped from review due to trivial changes (1)
  • libcudacxx/test/libcudacxx/cuda/containers/views/mdspan/shared_mem_mdspan/swap.pass.cpp
🚧 Files skipped from review as they are similar to previous changes (4)
  • libcudacxx/test/libcudacxx/cuda/utilities/variant/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/tuple/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/unexpected/device_only_types.pass.cpp
  • libcudacxx/test/libcudacxx/cuda/utilities/optional/device_only_types.pass.cpp


// REQUIRES: enable-tile

// UNSUPPORTED: clang-14

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

important: Add a reason for the clang-14 skip.
Line 12 marks UNSUPPORTED: clang-14 but does not state the failure mode; add a one-line rationale like the other tile tests in this cohort so the skip remains auditable.
As per coding guidelines, “If a test is unsupported, expected to fail, disabled, or skipped on a platform, motivate it with a comment.”

Source: Coding guidelines

Comment on lines +12 to +14
#include <cuda/std/cassert>
#include <cuda/std/expected>

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n 'cuda::std::initializer_list' libcudacxx/test/libcudacxx/cuda/utilities/unexpected/tile_only_types.pass.cpp
rg -n '#include <cuda/std/initializer_list>' libcudacxx/test/libcudacxx/cuda/utilities/unexpected/tile_only_types.pass.cpp

Repository: NVIDIA/cccl

Length of output: 160


🏁 Script executed:

cat -n libcudacxx/test/libcudacxx/cuda/utilities/unexpected/tile_only_types.pass.cpp | head -50

Repository: NVIDIA/cccl

Length of output: 1965


🏁 Script executed:

# Check what headers are included by <cuda/std/expected>
grep -r "include.*initializer_list" libcudacxx/include/cuda/std/expected* 2>/dev/null | head -20

Repository: NVIDIA/cccl

Length of output: 37


🏁 Script executed:

# Check the expected header content for includes
head -100 libcudacxx/include/cuda/std/expected | grep -E "^#include"

Repository: NVIDIA/cccl

Length of output: 300


🏁 Script executed:

# Check if initializer_list is included in the unexpected.h chain
grep -r "initializer_list" libcudacxx/include/cuda/std/__expected/ 2>/dev/null | head -10

Repository: NVIDIA/cccl

Length of output: 1409


🏁 Script executed:

# Check for similar patterns in other test files - do they directly include initializer_list?
fd -e "pass.cpp" libcudacxx/test/ --exec grep -l "cuda::std::initializer_list" {} \; | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && grep "#include.*initializer_list" {}'

Repository: NVIDIA/cccl

Length of output: 425


🏁 Script executed:

# Check for similar patterns in other test files - do they directly include initializer_list?
fd -e "pass.cpp" --full-path 'libcudacxx/test/' --exec grep -l "cuda::std::initializer_list" {} \; 2>/dev/null | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && grep "#include.*initializer_list" {} 2>/dev/null'

Repository: NVIDIA/cccl

Length of output: 560


important: Add explicit #include <cuda/std/initializer_list> header.
Line 37 uses cuda::std::initializer_list<int>{}, but the file relies on transitive inclusion from <cuda/std/expected>. Per coding guidelines, files must include all headers for symbols they directly use. The sibling test file libcudacxx/test/libcudacxx/cuda/utilities/expected/host_only_types.pass.cpp includes this header explicitly, establishing the expected pattern.

Source: Coding guidelines
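The include-what-you-use rule cited here can be illustrated with a small host-side sketch. It uses `std::initializer_list` from the standard library rather than `cuda::std::initializer_list`, and the function name `sketch_count` is hypothetical.

```cpp
#include <cstddef>
#include <initializer_list> // included directly because std::initializer_list is named below

// Hypothetical function for illustration: it names std::initializer_list itself,
// so it includes the header itself instead of relying on another header to pull it in.
inline std::size_t sketch_count(std::initializer_list<int> values) noexcept
{
  return values.size();
}
```

In the test files under review, the analogous change would be adding `#include <cuda/std/initializer_list>` next to the existing includes, as the comment requests.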

Comment on lines +12 to +14
#include <cuda/std/cassert>
#include <cuda/std/variant>

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n 'cuda::std::initializer_list' libcudacxx/test/libcudacxx/cuda/utilities/variant/tile_only_types.pass.cpp
rg -n '#include <cuda/std/initializer_list>' libcudacxx/test/libcudacxx/cuda/utilities/variant/tile_only_types.pass.cpp

Repository: NVIDIA/cccl

Length of output: 342


🏁 Script executed:

head -100 libcudacxx/test/libcudacxx/cuda/utilities/variant/tile_only_types.pass.cpp | tail -80

Repository: NVIDIA/cccl

Length of output: 2290


🏁 Script executed:

rg -n '#include' libcudacxx/test/libcudacxx/cuda/utilities/variant/tile_only_types.pass.cpp | head -20

Repository: NVIDIA/cccl

Length of output: 179


🏁 Script executed:

rg 'initializer_list' libcudacxx/include/cuda/std/variant -l

Repository: NVIDIA/cccl

Length of output: 37


🏁 Script executed:

find libcudacxx/include -name "variant*" -type f

Repository: NVIDIA/cccl

Length of output: 568


🏁 Script executed:

head -50 libcudacxx/include/cuda/std/variant | grep -n "include"

Repository: NVIDIA/cccl

Length of output: 433


🏁 Script executed:

rg '#include.*initializer_list' libcudacxx/include/cuda/std/ -A 1 -B 1

Repository: NVIDIA/cccl

Length of output: 5204


Add #include <cuda/std/initializer_list> explicitly. Lines 48, 53, and 90 construct cuda::std::initializer_list<int>{}, but this header is not included directly. Although it comes transitively through <cuda/std/variant>, the coding guideline requires including all headers for symbols you use; transitive header inclusion is not allowed.

Source: Coding guidelines

@github-actions

Contributor

🥳 CI Workflow Results

🟩 Finished in 8h 30m: Pass: 100%/113 | Total: 6d 03h | Max: 4h 16m | Hits: 40%/2709280

See results here.

@wmaxey wmaxey merged commit b919cbd into NVIDIA:branch/3.4.x Jun 23, 2026
132 of 135 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Done in CCCL Jun 23, 2026

Labels

None yet

Projects

Archived in project


3 participants