Support for CUDA 13.3

CUDA 13.3 is still unreleased, but it is worth tracking the cuTile Python commits related to it:
- `ct.exp(x, rounding_mode=...)`: expose RoundingMode.FULL/APPROX (https://github.com/NVIDIA/cutile-python/commit/b2d3f82)
- `@ct.kernel(num_worker_warps=...)`: entry hint for warp-specialized kernels (https://github.com/NVIDIA/cutile-python/commit/bfb2960)
- `ct.mma(..., use_fast_acc=True)`: fp8 MMA fast accumulator (https://github.com/NVIDIA/cutile-python/commit/0d172bf)
- `ct.atomic_add` on bfloat16 (sm_90+) (https://github.com/NVIDIA/cutile-python/commit/e517b6d)
- Tiled-view atomic ops: `tv.atomic_add`, `atomic_max`, `atomic_min`, `atomic_and`, `atomic_or`, `atomic_xor` (https://github.com/NVIDIA/cutile-python/commit/e517b6d)
- `ct.tiled_view(..., traversal_steps=...)` + load/store via `StridedView` (https://github.com/NVIDIA/cutile-python/commit/85da1e3)
- `ct.load_advanced_indexing` / `ct.store_advanced_indexing`: GatherScatterView via advanced indexing (https://github.com/NVIDIA/cutile-python/commit/c2360bd, renamed in https://github.com/NVIDIA/cutile-python/commit/d10a5da)
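
For anyone unfamiliar with the gather/scatter terminology in the last item, here is a rough plain-Python model of what a gather (load) and a scatter (store) through an index list usually mean. This is not cuTile code: `gather` and `scatter` are hypothetical helper names, and whether `ct.load_advanced_indexing` / `ct.store_advanced_indexing` follow exactly these semantics is an assumption until the release can be checked.

```python
# Plain-Python reference model of gather/scatter. This is NOT cuTile code.
# The function names below are hypothetical and exist only for illustration.

def gather(source, indices):
    """Read source[i] for each i in indices (the "load" direction)."""
    return [source[i] for i in indices]


def scatter(target, indices, values):
    """Write values[k] into target[indices[k]] (the "store" direction)."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    for i, v in zip(indices, values):
        target[i] = v
    return target


data = [10, 20, 30, 40, 50]
picked = gather(data, [4, 0, 2])           # reads positions 4, 0, 2 in that order
out = scatter([0] * 5, [4, 0, 2], picked)  # writes them back to the same positions
```

Positions that never appear in the index list are left untouched by the scatter, which is why `out` keeps its initial zeros there.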

Support for CUDA 13.3 #188

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Support for CUDA 13.3 #188

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions