Add `version` validation: import-time assertions, unit tests, PyPI fallback prevention, CI hardening #1454

rwgk · 2026-01-10T02:14:49Z

This PR adds safety mechanisms to prevent version-related issues during builds and ensures version detection from git tags is working as intended.

Version Number Validation

Problem: setuptools-scm silently falls back to version 0.1.x when git tags are unavailable (e.g., due to shallow clones), which can lead to incorrect version detection and unexpected dependency resolution. Silent failures in the procedure producing __version__ can lead to highly confusing behavior and potentially invalidate elaborate QA testing.

Solution: We implement a two-layer defense strategy to catch invalid versions at multiple stages:

First Line of Defense: Import-Time Assertions

Fail-fast assertions are added immediately after importing __version__ in all three package __init__.py files:

cuda_bindings/cuda/bindings/__init__.py
cuda_core/cuda/core/__init__.py
cuda_pathfinder/cuda/pathfinder/__init__.py

Each file includes a minimal one-liner assertion:

assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"

This ensures that any attempt to import a package with an invalid version (e.g., 0.1.dev...) fails immediately with a clear error, preventing the package from being used at all. The assertion checks that major.minor > (0, 1), which is sufficient since all three packages are already at higher versions.
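
To see how the one-liner behaves, the sketch below wraps the same comparison in a helper and applies it to version strings quoted elsewhere in this PR. The helper name `version_is_valid` is hypothetical and not part of the PR; only the comparison itself is taken from the assertion above.

```python
def version_is_valid(version: str) -> bool:
    """Hypothetical helper wrapping the same comparison as the one-line assertion."""
    # Only the first two dot-separated components are inspected, so suffixes
    # such as ".dev77+g1b2e4c088" never reach int().
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor > (0, 1)


samples = [
    "1.3.4.dev77+g1b2e4c088",  # expected cuda-pathfinder version
    "0.5.1.dev39+gc574a94c2",  # cuda-core version seen in CI
    "0.1.dev1+g1b2e4c088",     # setuptools-scm fallback seen in CI
    "0.0",                     # setuptools-scm default fallback
]
results = {version: version_is_valid(version) for version in samples}
for version, ok in results.items():
    print(f"{version}: {'accepted' if ok else 'rejected'}")
```

Because the check is a plain `assert` statement, Python skips it when run with `-O`.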

Second Line of Defense: Unit Tests

As a backup, we also implement late-stage detection via regular unit tests that validate version numbers after installation:

Adds validate_version_number() function to cuda_python_test_helpers for centralized validation logic
Creates minimal test_version_number.py files in cuda_bindings, cuda_core, and cuda_pathfinder that import the version and call the validation function
Tests are split into separate functions (test_bindings_version, test_core_version, test_pathfinder_version) so all invalid versions are reported in a single test run
Each test suite validates its own package version plus dependency versions (e.g., cuda_bindings tests check both cuda-bindings and cuda-pathfinder versions)
Provides clear error messages explaining the issue without referencing setuptools-scm internals

The unit tests run during the test phase and provide explicit test coverage with clearer error messages in CI logs.
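
A minimal sketch of what such a helper and its tests could look like follows. The signature, the error message, and the literal version strings are assumptions made for illustration; the real `validate_version_number()` in `cuda_python_test_helpers` may differ, and the real tests import `__version__` from the installed packages rather than using literals.

```python
def validate_version_number(version: str, package_name: str) -> None:
    """Hypothetical sketch: reject versions whose major.minor is not above 0.1."""
    parts = version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        raise AssertionError(
            f"{package_name}: unexpected version format {version!r}"
        ) from None
    if (major, minor) <= (0, 1):
        raise AssertionError(
            f"{package_name} has invalid version {version!r}: automatic version "
            "detection most likely failed when the package was built "
            "(for example, because git tags were unavailable)."
        )


# One test function per package, so pytest reports every invalid version
# in a single run instead of stopping at the first failure.
def test_pathfinder_version():
    # Literal stands in for: from cuda.pathfinder import __version__
    validate_version_number("1.3.4.dev86+gc574a94c2", "cuda-pathfinder")


def test_bindings_version():
    # Literal stands in for: from cuda.bindings import __version__
    validate_version_number("13.1.2.dev72+gc574a94c2", "cuda-bindings")
```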

Why Two Layers?

While the import-time assertions provide immediate feedback and prevent invalid packages from being imported, we maintain the unit tests as a second line of defense because:

Redundancy: If the assertions somehow fail to catch an issue (e.g., due to import path quirks or edge cases), the unit tests provide a backup check
Explicit Test Coverage: Unit tests make version validation an explicit, testable requirement rather than an implicit assertion
CI Visibility: Test failures in CI logs are more visible and easier to debug than import-time assertion failures
Defense in Depth: Silent failures in the procedure producing __version__ can lead to highly confusing behavior and potentially invalidate elaborate QA testing. Multiple detection points reduce the risk of invalid versions going undetected.

CI Workflow Hardening

To ensure version validation works correctly in CI environments, we've hardened the test workflows:

Intentional Shallow Clone: Test workflows explicitly use fetch-depth: 1 (the default) with a comment emphasizing that shallow cloning is intentional. This ensures we're testing wheel installation without full git history, which is the correct behavior for testing pre-built artifacts.
Wheel-Only Installation: The "Ensure cuda-python installable" step uses --only-binary=:all: to ensure we only test wheels, never build from source. This prevents pip from building packages from source when source code is present, which could lead to version issues in shallow clones.
Immediate Version Verification: After installing cuda-python, a new "Verify installed package versions" step imports cuda.pathfinder and cuda.bindings to trigger the __version__ assertions immediately. This provides early detection of invalid versions right after installation, before any tests run.

These changes ensure that:

We're testing the actual wheel artifacts, not building from source
Version issues are caught as early as possible in the CI pipeline
The test workflows are resilient even with shallow clones
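
The verification step amounts to importing each package and letting any import-time failure abort the job. A rough Python equivalent is sketched below; the function name `verify_imports` is hypothetical, and standard-library modules stand in for `cuda.pathfinder` and `cuda.bindings` so the sketch runs anywhere.

```python
import importlib


def verify_imports(module_names):
    """Import each module and return {module name: __version__ or None}.

    Any exception raised during import, including an AssertionError from a
    version check in a package's __init__.py, propagates and fails the step.
    """
    versions = {}
    for name in module_names:
        module = importlib.import_module(name)
        versions[name] = getattr(module, "__version__", None)
    return versions


# In CI the list would be ["cuda.pathfinder", "cuda.bindings"].
versions = verify_imports(["os", "sys"])
print(versions)
```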

Why Not Early-Stage Detection?

We initially attempted early-stage detection in build hooks (build_wheel, build_editable) to catch fallback versions during the build process. However, this approach proved too fragile:

Timing Issues: _version.py files are written by setuptools-scm during prepare_metadata_for_build_wheel, but validation needs to run at the right time in the build process. Attempting to validate too early results in "file not found" errors, while validating too late allows builds to complete with invalid versions.
Shallow Clone Handling: When setuptools-scm detects a shallow clone, it bypasses git_describe_command entirely and falls back to 0.0 or 0.1.x versions before our validation can run. This makes build-time detection unreliable in CI environments that use shallow clones.
Complexity: The build hook approach required careful coordination between PEP 517 hooks (prepare_metadata_for_build_wheel, build_wheel) and custom validation logic, making it error-prone and difficult to maintain.

Given these challenges, we decided to use the simpler and more certain approach: import-time assertions for immediate feedback, unit tests for explicit coverage, and CI workflow hardening to ensure the validation works correctly in all environments.

PyPI Fallback Prevention

Problem: During isolated (PEP 517) builds, a just-built cuda-bindings installation could be incorrectly replaced with a PyPI wheel if the installed version didn't match expectations.

Solution: Enhanced cuda_core/build_hooks.py to:

Check installed cuda-bindings version using direct import of cuda.bindings._version (note that importlib.metadata cannot be used in isolated environments)
Detect editable installs by checking if the _version.py file path is within the repository root
Prevent replacement of editable installs
Ensure version compatibility: if cuda-bindings is installed (non-editable) and its major version doesn't match the CUDA major version, raise an exception

This prevents accidental installation of incompatible cuda-bindings versions from PyPI during builds.
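
The decision logic described above can be sketched as follows. This is an illustration under stated assumptions, not the code in cuda_core/build_hooks.py: the function names, parameters, and error text are hypothetical, and in the real hook the version string and file path would come from importing `cuda.bindings._version`.

```python
from pathlib import Path


def is_editable_install(version_file: str, repo_root: str) -> bool:
    """Treat the install as editable if _version.py lives inside the repository."""
    return Path(version_file).resolve().is_relative_to(Path(repo_root).resolve())


def check_bindings_compatible(
    installed_version: str, cuda_major: str, version_file: str, repo_root: str
) -> None:
    """Raise if a non-editable cuda-bindings does not match the CUDA major version."""
    if is_editable_install(version_file, repo_root):
        # Built from sources in this checkout: keep it regardless of version.
        return
    installed_major = installed_version.split(".")[0]
    if installed_major != cuda_major:
        raise RuntimeError(
            f"cuda-bindings {installed_version} is installed, but CUDA major "
            f"version {cuda_major} is required"
        )
```

For example, `check_bindings_compatible("12.9.6", "13", "/venv/site-packages/cuda/bindings/_version.py", "/repo")` raises, while the same call with a `_version.py` path under `/repo` returns without error.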

Piggy-backed:

Import Sorting Fix

Problem: Ruff's import sorting (I001) was inconsistently reordering the _version import in cuda_pathfinder/cuda/pathfinder/__init__.py, depending on whether _version.py exists or not (e.g., after git clean -fdx).

Solution: Added # isort: skip directive to the _version import line to prevent ruff from moving it. This ensures consistent import ordering regardless of build state.

Note on setuptools-scm RuntimeWarning

We've observed RuntimeWarnings from setuptools-scm that incorrectly display package versions instead of setuptools versions (e.g., ERROR: setuptools==0.5.1.dev20+gf8dddb370 is used in combination with setuptools-scm>=8.x). This appears to be a known issue in setuptools-scm when using custom build backends (see setuptools-scm issue #1192). A minimal reproducer has been created at github.com/rwgk/setuptools-scm-issue-1192.

These warnings don't affect functionality but are noisy. They occur on both main and this branch.

Add pre-build validation that checks git tag availability directly to ensure builds fail early with clear error messages before setuptools-scm silently falls back to version '0.1.x'.

Changes:
- cuda_bindings/setup.py: Validate tags at import time (before setuptools-scm)
- cuda_core/build_hooks.py: Validate tags in _build_cuda_core() before building
- cuda_pathfinder/build_hooks.py: New custom build backend that validates tags before delegating to setuptools.build_meta
- cuda_pathfinder/pyproject.toml: Configure custom build backend

Benefits:
- Fails immediately when pip install -e . is run, not during build
- More direct: tests what setuptools-scm actually needs (git describe)
- Cleaner: no dependency on generated files
- Better UX: clear error messages with actionable fixes

Error messages include:
- Clear explanation of the problem
- The actual git error output
- Common causes (tags not fetched, wrong directory, etc.)
- Package-specific debugging commands
- Actionable fix: git fetch --tags

Add validation in _get_cuda_bindings_require() to check if cuda-bindings is already installed and validate its version compatibility.

Strategy:
- If cuda-bindings is not installed: require matching CUDA major version
- If installed from sources (editable): keep it regardless of version
- If installed from wheel: validate major version matches CUDA major
- Raise clear error if version mismatch detected

This prevents accidentally using PyPI versions that don't match the CUDA toolkit version being used for compilation.

Changes:
- Add _check_cuda_bindings_installed() to detect installation status
- Check for editable installs via direct_url.json, repo location, or .egg-link
- Validate version compatibility in _get_cuda_bindings_require()
- Move imports to module level (PEP 8 compliance)
- Add noqa: S110 for broad exception handling (intentional)

Remove Methods 2 and 3 for detecting editable installs, keeping only PEP 610 (direct_url.json) which is the standard for Python 3.10+ and pip 20.1+.

Changes:
- Remove Method 2: import cuda.bindings to check repo location (problematic during build requirement phase)
- Remove Method 3: .egg-link file detection (obsolete for Python 3.10+)
- Keep only PEP 610 method (direct_url.json) which is reliable and doesn't require importing modules during build

This fixes build errors caused by importing cuda.bindings during the build requirement phase, which interfered with Cython compilation.

…s detection

Replace importlib.metadata.distribution() with direct import of cuda.bindings._version module. The former may incorrectly return the cuda-core distribution when queried for 'cuda-bindings' in isolated build environments (tested with Python 3.12 and pip 25.3). This may be due to cuda-core metadata being written during the build process before cuda-bindings is fully available, causing importlib.metadata to return the wrong distribution.

Also ensure cuda-bindings is always required in build environment by returning ['cuda-bindings'] instead of [] when already installed. This ensures pip makes it available in isolated build environments even if installed elsewhere.

Fix import sorting inconsistency for _version import in cuda_pathfinder by adding 'isort: skip' directive.

Make _validate_git_tags_available() take tag_pattern as parameter and ensure all three implementations (cuda-core, cuda-pathfinder, cuda-bindings) are identical. Add sync comments to remind maintainers to keep them in sync. Also fix ruff noqa comments: S603 on subprocess.run() line, S607 on list argument line.

copy-pr-bot · 2026-01-10T02:14:53Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk · 2026-01-10T02:15:52Z

/ok to test

github-actions · 2026-01-10T02:25:41Z

Doc Preview CI
🚀 View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-1454/
https://nvidia.github.io/cuda-python/pr-preview/pr-1454/cuda-core/
https://nvidia.github.io/cuda-python/pr-preview/pr-1454/cuda-bindings/
https://nvidia.github.io/cuda-python/pr-preview/pr-1454/cuda-pathfinder/
Preview will be ready when the GitHub Pages deployment is complete.

Replace _validate_git_tags_available() functions with DRY shim/runner pattern:
- Create scripts/git_describe_command_runner.py: shared implementation
- Create git_describe_command_shim.py in each package: thin wrappers that check for scripts/ directory and delegate to the runner
- Update pyproject.toml files to use git_describe_command_shim.py
- Remove all three copies of _validate_git_tags_available() from build_hooks.py and setup.py

Benefits:
- DRY: single implementation in scripts/
- Portable: Python is always available (no git in PATH requirement)
- Clear error messages: shims check for scripts/ and provide context
- No import-time validation: only runs when setuptools-scm calls it
- Cleaner code: cuda_pathfinder/build_hooks.py is now just 13 lines

All three shim files are identical copies and must be kept in sync.

rwgk · 2026-01-10T21:16:05Z

/ok to test

Remove unnecessary shim files and use scripts/git_describe_wrapper.py directly. Since setuptools_scm runs from repo root (root = ".."), we can call the shared wrapper script directly without package-specific shims.

Changes:
- Remove all three git_describe_command_shim.py files
- Update pyproject.toml files to use scripts/git_describe_wrapper.py
- Remove cuda_pathfinder/build_hooks.py (was just pass-through)
- Remove pre-commit hook for checking shim files
- Rename git_describe_command_runner.py to git_describe_wrapper.py

This simplifies the codebase while maintaining the same functionality:
- Single shared implementation for git describe
- Clear error messages when tags are missing
- Works correctly from repo root where setuptools-scm runs

rwgk · 2026-01-11T02:36:21Z

/ok to test

rwgk · 2026-01-11T05:37:26Z

setuptools-scm Shallow Clone Fallback Issue (exists already on `main`)

Summary

During CI testing, we discovered that cuda-pathfinder is being built with fallback version 0.1.dev1+g1b2e4c088 instead of the expected 1.3.4.dev77+g1b2e4c088 in some Windows test runs. This occurs when setuptools-scm detects a shallow Git clone and silently falls back to the default version without calling our git_describe_wrapper.py.

Root Cause

Shallow Clone Detection: setuptools-scm detects shallow clones early in its version detection process
Early Fallback: When a shallow clone is detected, setuptools-scm bypasses git_describe_command entirely and falls back to 0.0 or 0.1.x versions
Wrapper Never Called: Our git_describe_wrapper.py is never executed because setuptools-scm decides to skip version detection before attempting to call it

Evidence

Test Results

The fallback version 0.1.dev1+g1b2e4c088 was detected in 18 Windows test files (all Windows tests):

Test_win-64___py3.10__12.9.1__wheels__rtx2080__WDDM_.txt
Test_win-64___py3.10__13.0.2__local__rtxpro6000__TCC_.txt
Test_win-64___py3.10__13.1.0__local__rtxpro6000__TCC_.txt
Test_win-64___py3.11__12.9.1__local__v100__MCDM_.txt
Test_win-64___py3.11__13.0.2__wheels__rtx4090__WDDM_.txt
Test_win-64___py3.11__13.1.0__wheels__rtx4090__WDDM_.txt
Test_win-64___py3.12__12.9.1__wheels__l4__MCDM_.txt
Test_win-64___py3.12__13.0.2__local__a100__TCC_.txt
Test_win-64___py3.12__13.1.0__local__a100__TCC_.txt
Test_win-64___py3.13__12.9.1__local__l4__TCC_.txt
Test_win-64___py3.13__13.0.2__wheels__rtxpro6000__MCDM_.txt
Test_win-64___py3.13__13.1.0__wheels__rtxpro6000__MCDM_.txt
Test_win-64___py3.14__12.9.1__wheels__v100__TCC_.txt
Test_win-64___py3.14__13.0.2__local__l4__MCDM_.txt
Test_win-64___py3.14__13.1.0__local__l4__MCDM_.txt
Test_win-64___py3.14t__12.9.1__local__l4__TCC_.txt
Test_win-64___py3.14t__13.0.2__wheels__a100__MCDM_.txt
Test_win-64___py3.14t__13.1.0__wheels__a100__MCDM_.txt

Notes:

All affected tests are Windows-based. Linux tests appear to have proper version numbers.
We also found 0.1.dev in the same 18 CI logs from the latest run on main.

Log Evidence

From Test_win-64___py3.14t__12.9.1__local__l4__TCC_.txt:

2026-01-11T03:06:35.1995098Z   C:\Windows\Temp\pip-build-env-qw2xma_m\overlay\Lib\site-packages\setuptools_scm\git.py:202: UserWarning: "C:\actions-runner\_work\cuda-python\cuda-python" is shallow and may cause errors
2026-01-11T03:06:35.1995894Z     warnings.warn(f'"{wd.path}" is shallow and may cause errors')
2026-01-11T03:06:36.0998424Z   Created wheel for cuda-pathfinder: filename=cuda_pathfinder-0.1.dev1+g1b2e4c088-py3-none-any.whl
2026-01-11T03:06:54.4353416Z cuda-bindings 12.9.6.dev2+g563cd83db requires cuda-pathfinder~=1.1, but you have cuda-pathfinder 0.1.dev1+g1b2e4c088 which is incompatible.

Local Testing

We tested locally with a shallow clone and confirmed:

Wrapper Not Called: Even with git_describe_command configured, the wrapper script is never executed
Silent Fallback: setuptools-scm detects the shallow clone, warns about it, but proceeds with fallback version 0.0 or 0.1.x
No Error: The build succeeds with the wrong version, causing dependency conflicts

setuptools-scm Behavior

From setuptools-scm source code (git.py):

def version_from_describe(...):
    if describe_command is not None:
        # Only called if setuptools-scm decides to attempt version detection
        describe_res = _run(describe_command, wd.path)
    else:
        describe_res = wd.default_describe()
    
    return describe_res.parse_success(parse=parse_describe)

# In _git_parse_inner:
version = version_from_describe(wd, config, describe_command)

if version is None:
    # Falls back to 0.0 or configured fallback_version
    tag = config.version_cls(config.fallback_version or "0.0")
    # ... creates version with fallback

Key Finding: When setuptools-scm detects a shallow clone early (via is_shallow()), it may skip calling git_describe_command entirely and go straight to the fallback.

Impact

CI Testing: Windows CI runs are building cuda-pathfinder with incorrect versions
Dependency Conflicts: The fallback version 0.1.dev1 doesn't satisfy cuda-bindings requirement ~=1.1
SWQA Team Risk: If SWQA uses shallow clones, they'll encounter the same issue
Silent Failure: The build succeeds but with wrong version, making it hard to detect
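
The dependency conflict follows directly from the meaning of the compatible-release specifier: `~=1.1` is equivalent to `>=1.1, ==1.*`. The sketch below checks this against version strings from the logs. It is a deliberate simplification that compares only the numeric release segment and ignores the pre-release, dev, and local-version ordering rules of the full version specification; the helper names are hypothetical.

```python
import re


def release_tuple(version: str) -> tuple:
    """Leading numeric release segment, e.g. '1.3.4.dev85+gabc' -> (1, 3, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no release segment in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_compatible_1_1(version: str) -> bool:
    """Simplified check for '~=1.1', i.e. '>=1.1, ==1.*' (release numbers only)."""
    release = release_tuple(version)
    return release[0] == 1 and release >= (1, 1)


for candidate in ["1.3.4.dev77+g1b2e4c088", "0.1.dev1+g1b2e4c088"]:
    verdict = "satisfies" if satisfies_compatible_1_1(candidate) else "violates"
    print(f"cuda-pathfinder {candidate} {verdict} ~=1.1")
```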

Attempted Solutions

1. git_describe_wrapper.py Enhancement

We enhanced scripts/git_describe_wrapper.py to detect shallow clones proactively:

import subprocess
import sys

# Check if repository is shallow
result = subprocess.run(
    ["git", "rev-parse", "--is-shallow-repository"],
    capture_output=True,
    text=True,
    timeout=5,
)
# git prints "true" for a shallow repository and "false" otherwise
if result.returncode == 0 and result.stdout.strip() == "true":
    print("ERROR: Repository is a shallow clone.", file=sys.stderr)
    sys.exit(1)

Result: ❌ Doesn't work - The wrapper is never called when setuptools-scm detects a shallow clone.

2. setuptools-scm `fail_on_shallow` Option

setuptools-scm provides a pre_parse = "fail_on_shallow" option that should fail builds on shallow clones.

Result: ❌ Didn't work in our tests - May require different configuration or setuptools-scm version.

References

setuptools-scm source: https://github.com/pypa/setuptools-scm
Git shallow clone: https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---depthltdepthgt

Add build-time validation to detect when setuptools-scm falls back to default versions (0.0.x or 0.1.dev*) due to shallow clones or missing git tags. This prevents silent failures that cause dependency conflicts.

Changes:
- scripts/validate_version.py: New DRY validation script that checks for fallback versions and validates against expected patterns
- cuda_core/build_hooks.py: Add validation in prepare_metadata hooks
- cuda_pathfinder/build_hooks.py: New build hooks with version validation
- cuda_pathfinder/pyproject.toml: Use custom build_hooks backend
- cuda_bindings/setup.py: Add ValidateVersion command class

The validation runs after setuptools-scm generates _version.py files, ensuring we catch fallback versions before builds complete. This will cause the 18 Windows CI tests that currently use fallback versions to fail with clear error messages instead of silently using wrong versions.

Related to shallow clone issue documented in PR NVIDIA#1454.

…etadata

Move validation from prepare_metadata_for_build_* to build_editable/build_wheel where _version.py definitely exists. This fixes build failures where validation ran before setuptools-scm wrote the version file.

rwgk · 2026-01-11T06:36:38Z

/ok to test

rwgk · 2026-01-11T06:50:36Z

/ok to test

This commit adds version number validation tests to detect when automatic version detection fails and falls back to invalid versions (e.g., 0.0.x or 0.1.dev*). This addresses two main concerns:

1. Fallback version numbers going undetected: When setuptools-scm cannot detect version from git tags (e.g., due to shallow clones), it silently falls back to default versions like 0.1.dev*. These invalid versions can cause dependency conflicts and confusion in production/SWQA environments.
2. PyPI wheel replacement: The critical issue of just-built cuda-bindings being replaced with PyPI wheels is already handled by _check_cuda_bindings_installed() in cuda_core/build_hooks.py.

Rather than attempting complex early detection in build hooks (which proved fragile due to timing issues with when _version.py files are written), we implement late-stage detection via test files. This approach is:
- Simpler: No complex build hook timing issues
- Reliable: Tests run after installation when versions are definitely available
- Sufficient: Catches issues before they reach production/SWQA

Changes:
- Add validate_version_number() function to cuda_python_test_helpers for centralized validation logic
- Create minimal test_version_number.py files in cuda_bindings, cuda_core, and cuda_pathfinder that import the version and call the validation function
- Add helpers/__init__.py files in cuda_bindings/tests and cuda_pathfinder/tests to enable importing from cuda_python_test_helpers
- Update cuda_core/tests/helpers/__init__.py to use ModuleNotFoundError instead of ImportError for consistency

The validation checks that versions have major.minor > 0.1, which is sufficient since all three packages are already at higher versions. Error messages explain the issue without referencing setuptools-scm internals.

rwgk · 2026-01-11T23:05:44Z

/ok to test

The supports_ipc_mempool function was previously defined in cuda_python_test_helpers, but it had a hard dependency on cuda.core._utils.cuda_utils.handle_return. This caused CI failures when cuda_bindings or cuda_pathfinder tests tried to import cuda_python_test_helpers, because cuda-core might not be installed in those test environments.

By moving supports_ipc_mempool to cuda_core/tests/helpers/__init__.py, we ensure that:
- cuda_python_test_helpers remains free of cuda-core-specific dependencies
- The function is only available where cuda-core is guaranteed to be installed (i.e., in cuda_core tests)
- cuda_bindings and cuda_pathfinder can safely import cuda_python_test_helpers without requiring cuda-core

Changes:
- Move supports_ipc_mempool from cuda_python_test_helpers to cuda_core/tests/helpers/__init__.py
- Update cuda_core/tests/test_memory.py to import from helpers instead of cuda_python_test_helpers
- Remove unused imports (functools, Union, handle_return) from cuda_python_test_helpers/__init__.py
- Remove supports_ipc_mempool from cuda_python_test_helpers __all__

This fixes CI failures where importing cuda_python_test_helpers would fail due to missing cuda-core dependencies.

rwgk · 2026-01-11T23:46:04Z

/ok to test

The main purpose of these tests is to validate that dependencies have valid version numbers, not just the package being tested. This is critical for catching cases where a dependency (e.g., cuda-pathfinder) might be built with a fallback version (0.1.dev...) due to shallow git clones or missing tags.

To ensure we see all invalid versions in a single test run, we organize the tests as separate test functions (test_bindings_version, test_core_version, test_pathfinder_version) rather than combining them into a single function. This way, if multiple packages have invalid versions, pytest will report all failures rather than stopping at the first one.

Changes:
- cuda_bindings/tests/test_version_number.py: Tests both cuda-bindings and cuda-pathfinder versions
- cuda_core/tests/test_version_number.py: Tests cuda-bindings, cuda-core, and cuda-pathfinder versions
- cuda_pathfinder/tests/test_version_number.py: Tests cuda-pathfinder version (renamed function for consistency)

rwgk · 2026-01-12T00:51:05Z

/ok to test

Add minimal assertions immediately after importing __version__ in all three package __init__.py files to fail fast if an invalid version (e.g., 0.1.dev...) is detected. This prevents packages with fallback versions from being imported or used, catching the issue at the earliest possible point.

The assertion checks that major.minor > (0, 1) using a minimal one-liner:

assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"

Strictly speaking this makes the unit tests redundant, but we want to keep the unit tests as a second line of defense. The assertions provide immediate feedback during import, while the unit tests provide explicit test coverage and clearer error messages in CI logs.

Changes:
- cuda_bindings/cuda/bindings/__init__.py: Add version assertion
- cuda_core/cuda/core/__init__.py: Add version assertion
- cuda_pathfinder/cuda/pathfinder/__init__.py: Add version assertion

Use an intentionally shallow clone (fetch-depth: 1) to test wheel installation without full git history. This ensures we're testing the wheel artifacts themselves, not building from source.

Changes:
- Set fetch-depth: 1 explicitly (although it is the default) with comment emphasizing that shallow cloning is intentional
- Add --only-binary=:all: to cuda-python installation to ensure we only test wheels, never build from source
- Add "Verify installed package versions" step that imports cuda.pathfinder and cuda.bindings to trigger __version__ assertions immediately after installation, providing early detection of invalid versions
- Update comments to accurately reflect that we're testing wheel artifacts

This approach hardens the test workflows by:
- Making the shallow clone intentional and explicit
- Actually testing that __version__ assertions work (fail-fast on invalid versions)
- Catching version issues immediately after installation, before tests run
- Ensuring we only test wheels, not source builds

Applied consistently to both:
- .github/workflows/test-wheel-windows.yml
- .github/workflows/test-wheel-linux.yml

rwgk · 2026-01-12T05:38:00Z

/ok to test

rwgk · 2026-01-12T07:03:28Z

CI Logs Analysis: PR #1454 vs Main Branch

Summary

After carefully analyzing the CI logs from PR #1454 (/wrk/logs_20909188905) and comparing them to the main branch logs (/wrk/main_logs_20871705999), I found that:

0.1.dev versions still appear in both PR and main branch logs (18 Windows test files in each)
However, they are caught and replaced before tests run
All version tests pass - no invalid versions reach the actual test execution
The root cause is a workflow bug in the Windows test workflow

Detailed Findings

0.1.dev Versions Still Appear

PR Logs: Found cuda-pathfinder-0.1.dev1+g233447b1b in 18 Windows test files
Main Logs: Found cuda-pathfinder-0.1.dev1+gacc78f7c0 in 18 Windows test files

The same number of occurrences suggests this is a pre-existing issue, not something introduced by the PR.

Where They Come From

The bad versions are built during the "Install cuda.pathfinder extra wheels for testing" step. The issue is in the Windows workflow:

Windows workflow (.github/workflows/test-wheel-windows.yml line 259):

pip install --only-binary=:all: -v . --group "test-cu${TEST_CUDA_MAJOR}"

Linux workflow (.github/workflows/test-wheel-linux.yml line 292):

pip install --only-binary=:all: -v ./*.whl --group "test-cu${TEST_CUDA_MAJOR}"

The Problem: When pip processes . (a directory), it needs to prepare metadata to determine dependencies, which triggers setuptools-scm to build from source (in a shallow clone, resulting in 0.1.dev1). The --only-binary=:all: flag doesn't prevent metadata preparation - it only prevents installing from source distributions.

When pip processes ./*.whl, it installs directly from the wheel file without needing to prepare metadata, avoiding the source build entirely.

What Happens Next

The sequence in affected Windows tests:

Good version installed first: cuda-pathfinder-1.3.4.dev85+g233447b1b (from wheel artifact)
Version verification passes: The "Verify installed package versions" step successfully imports cuda.pathfinder and cuda.bindings (lines 15404-15405)
Bad version built: During "Install cuda.pathfinder extra wheels", pip builds 0.1.dev1 from source due to the . directory issue
Bad version installed: Successfully installed cuda-pathfinder-0.1.dev1+g233447b1b (line 15677)
Bad version replaced: Uninstalling cuda-pathfinder-0.1.dev1+g233447b1b then Successfully installed cuda-pathfinder-1.3.4.dev85+g233447b1b (lines 15758-15760)
Tests run with good version: All test_version_number tests pass

Version Tests Status

All version tests pass in both PR and main branch logs:

test_bindings_version PASSED
test_core_version PASSED
test_pathfinder_version PASSED

This confirms that:

The import-time assertions work correctly (version verification step passes)
The unit tests work correctly (all version tests pass)
No invalid versions reach actual test execution

Comparison: PR vs Main

| Metric | PR Logs | Main Logs |
| --- | --- | --- |
| Files with `0.1.dev` versions | 18 (Windows) | 18 (Windows) |
| Version test failures | 0 | 0 |
| Import-time assertion failures | 0 | 0 |

Key Difference: The PR has the "Verify installed package versions" step that explicitly tests the import-time assertions, providing early detection. The main branch doesn't have this step, but also doesn't fail because the bad versions get replaced before tests run.

Remaining Issue

While the bad versions are caught and replaced, they still cause:

Dependency conflict warnings: cuda-bindings requires cuda-pathfinder~=1.1, but you have cuda-pathfinder 0.1.dev1+g233447b1b which is incompatible
Inefficiency: Unnecessary source builds and reinstallations
Risk: If the replacement step were to fail, tests would run with the bad version

Recommended Fix

Change the Windows workflow to match the Linux workflow:

- pip install --only-binary=:all: -v . --group "test-cu${TEST_CUDA_MAJOR}"
+ pip install --only-binary=:all: -v ./*.whl --group "test-cu${TEST_CUDA_MAJOR}"

This will prevent pip from building from source during metadata preparation, eliminating the temporary 0.1.dev versions entirely.
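
Since the fix installs from wheel files, the version is already visible in each wheel's filename, which has the form `{name}-{version}-...whl`. The sketch below shows how a wheel could be inspected for a fallback version before installation. It is a hypothetical illustration and not part of the workflow change; both function names are made up for this example.

```python
def wheel_name_and_version(filename: str) -> tuple:
    """Split a wheel filename of the form '{name}-{version}-...whl'."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Name and version never contain '-' in a wheel filename, so the first
    # two '-'-separated fields are exactly the name and the version.
    name, version = filename[: -len(".whl")].split("-")[:2]
    return name, version


def is_fallback_version(version: str) -> bool:
    """Same major.minor comparison as the import-time assertion, inverted."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor <= (0, 1)


wheels = [
    "cuda_pathfinder-1.3.4.dev86+gc574a94c2-py3-none-any.whl",
    "cuda_pathfinder-0.1.dev1+g1b2e4c088-py3-none-any.whl",
]
for wheel in wheels:
    name, version = wheel_name_and_version(wheel)
    status = "fallback version" if is_fallback_version(version) else "ok"
    print(f"{name} {version}: {status}")
```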

Conclusion

The PR's version validation mechanisms are working correctly:

✅ Import-time assertions catch invalid versions immediately
✅ Unit tests provide explicit coverage and pass
✅ CI workflow hardening provides early detection

The remaining 0.1.dev versions are a pre-existing workflow bug (Windows using `.` instead of `./*.whl`) that causes temporary bad versions, but they are caught and replaced before tests run. The fix is straightforward: update the Windows workflow to use `./*.whl` instead of `.`.

Change pip install command from '.' to './*.whl' to prevent pip from building from source during metadata preparation. This matches the Linux workflow and eliminates the temporary 0.1.dev versions that were being built in shallow clones. See: NVIDIA#1454 (comment)

rwgk · 2026-01-12T07:08:12Z

/ok to test

rwgk · 2026-01-12T08:02:02Z

CI Logs Analysis: PR #1454 (post commit `c574a94`)

CI Run: /wrk/logs_20910873985

Executive Summary

✅ All checks passed successfully. The fix to use ./*.whl instead of . in the Windows workflow has eliminated the 0.1.dev fallback versions that were previously appearing. All version validation mechanisms are working as intended.

Key Findings

1. No `0.1.dev` Package Versions Detected

Search Results: Comprehensive grep across all log files found zero instances of cuda-pathfinder, cuda-bindings, or cuda-core packages with 0.1.dev versions.

All installed packages show valid versions:
- cuda-pathfinder-1.3.4.dev86+gc574a94c2
- cuda-bindings-13.1.2.dev72+gc574a94c2
- cuda-core-0.5.1.dev39+gc574a94c2

2. Windows Workflow Fix Confirmed

The fix to change pip install --only-binary=:all: -v . to pip install --only-binary=:all: -v ./*.whl is working correctly:

Example from Test_win-64___py3.14t__13.1.0__wheels__a100__MCDM_.txt:

2026-01-12T07:27:30.4348704Z + pip install --only-binary=:all: -v ./cuda_pathfinder-1.3.4.dev86+gc574a94c2-py3-none-any.whl --group test-cu13
2026-01-12T07:27:31.1679583Z Processing c:\actions-runner\_work\cuda-python\cuda-python\cuda_pathfinder\cuda_pathfinder-1.3.4.dev86+gc574a94c2-py3-none-any.whl

Pip is now correctly processing the wheel file directly, preventing any source builds that would trigger setuptools-scm in shallow clones.

3. Version Validation Tests Passing

All version validation tests are passing across all platforms:

Windows Examples:

tests/test_version_number.py::test_pathfinder_version PASSED
tests/test_version_number.py::test_bindings_version PASSED
tests/test_version_number.py::test_core_version PASSED

Linux Examples:

tests/test_version_number.py::test_pathfinder_version PASSED
tests/test_version_number.py::test_bindings_version PASSED
tests/test_version_number.py::test_core_version PASSED

4. Import-Time Verification Steps Executing

The CI workflow hardening steps are executing successfully:

Windows (Test_win-64___py3.14t__13.1.0__wheels__a100__MCDM_.txt):

2026-01-12T07:27:30.1648632Z + python -c 'import cuda.pathfinder'
2026-01-12T07:27:30.2797131Z + python -c 'import cuda.bindings'

Linux (Test_linux-64___py3.14t__13.1.0__local__l4.txt):

2026-01-12T07:21:49.6369200Z + python -c 'import cuda.pathfinder'
2026-01-12T07:21:49.7363396Z + python -c 'import cuda.bindings'

These steps exercise the import-time assertions in the __init__.py files: if an invalid version were present, the import would fail and the job would stop immediately.

5. No Dependency Conflicts

Search Results: No dependency conflict warnings related to version mismatches. The only "incompatible" matches found were from unrelated test names (test_from_buffer_incompatible_dtype_and_itemsize), which are expected test cases.

6. No Assertion Errors

Search Results: Zero instances of:

FATAL: invalid __version__
AssertionError.*version
Invalid version number detected

This confirms that:

All packages have valid versions
Import-time assertions are not triggering
Unit tests are not encountering invalid versions

Comparison with Previous Analysis

Previous Issue (Logs `/wrk/logs_20909188905`)

In the previous CI run, we identified that Windows workflows were building cuda-pathfinder from source with 0.1.dev versions, because pip treats `.` as a local source tree to build, even with `--only-binary=:all:`. The bad versions were then replaced by wheels before tests ran, masking the issue.

Current State (Logs `/wrk/logs_20910873985`)

✅ Issue Resolved: Using `./*.whl` instead of `.` ensures pip installs directly from wheel files, preventing any source builds in shallow clones.

Build Workflow Analysis

Build Jobs

All build jobs show cuda-pathfinder being built from source (expected behavior in build workflows):

Building wheels for collected packages: cuda-pathfinder

However, these builds occur in contexts where full git history is available (build jobs fetch tags), so setuptools-scm correctly generates versions like 1.3.4.dev86+gc574a94c2.
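
Whether a given checkout is shallow can be checked before trusting the version that setuptools-scm derives from it. The following is a minimal sketch and, as far as this thread shows, not part of the PR; the function names are hypothetical. It relies on `git rev-parse --is-shallow-repository`, which reasonably recent Git versions provide and which prints `true` or `false`.

```python
import subprocess


def parse_is_shallow(output):
    """Interpret the output of `git rev-parse --is-shallow-repository`."""
    return output.strip() == "true"


def is_shallow_repository(repo_dir="."):
    """Return True if the git checkout at repo_dir is a shallow clone."""
    result = subprocess.run(
        ["git", "rev-parse", "--is-shallow-repository"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. outside a repository
    )
    return parse_is_shallow(result.stdout)
```

A build step could call `is_shallow_repository()` and refuse to build, or fetch tags first, when it returns `True`.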

Test Jobs (Wheel Installation)

Test jobs that install from wheels show:

Processing c:\actions-runner\_work\cuda-python\cuda-python\cuda_pathfinder\cuda_pathfinder-1.3.4.dev86+gc574a94c2-py3-none-any.whl
No source builds triggered
All packages installed with correct versions

Two-Layer Defense Verification

Layer 1: Import-Time Assertions ✅

The assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1) statements in __init__.py files are:

Present in all three packages (cuda.bindings, cuda.core, cuda.pathfinder)
Not triggering (no assertion errors in logs)
Verified by the explicit import steps in CI workflows
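
To make the comparison concrete, the sketch below wraps the same expression in a function (the name `version_passes_import_check` is only for illustration) and applies it to the versions seen in the logs above, plus one fallback-style version whose hash is made up.

```python
def version_passes_import_check(version):
    """Apply the same major.minor > (0, 1) comparison as the import-time assertion."""
    return tuple(int(part) for part in version.split(".")[:2]) > (0, 1)


# Versions reported in the CI logs in this thread pass the check.
assert version_passes_import_check("1.3.4.dev86+gc574a94c2")
assert version_passes_import_check("13.1.2.dev72+gc574a94c2")
assert version_passes_import_check("0.5.1.dev39+gc574a94c2")

# A setuptools-scm fallback version (hash is hypothetical) is rejected,
# because (0, 1) is not strictly greater than (0, 1).
assert not version_passes_import_check("0.1.dev1+gabcdef0")
```

Only the first two dot-separated components are converted to integers, so the `.devN+g<hash>` suffix never reaches `int()`.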

Layer 2: Unit Tests ✅

The test_version_number.py tests are:

Running successfully across all platforms
Passing for all three packages
Providing granular reporting (separate tests for each package)
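
The helper behind these tests is not shown in this thread. The following is a minimal sketch of what a `validate_version_number()` function could look like; the signature, the `package_name` parameter, and the exact message wording are assumptions, with only the phrase "Invalid version number detected" taken from the log search above.

```python
def validate_version_number(version, package_name):
    """Raise AssertionError if version looks like a setuptools-scm fallback version."""
    parts = version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        raise AssertionError(
            f"Invalid version number detected in {package_name}: "
            f"{version!r} (cannot parse major.minor)"
        ) from None
    # Fallback versions look like 0.1.dev...; all three packages are above 0.1.
    if (major, minor) <= (0, 1):
        raise AssertionError(
            f"Invalid version number detected in {package_name}: {version!r}"
        )
```

Each of `test_bindings_version`, `test_core_version`, and `test_pathfinder_version` would then pass its package's `__version__` to this function, so one bad package does not hide the result for the others.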

Conclusion

The PR's version validation strategy is working as designed:

✅ Import-time assertions provide fail-fast detection of invalid versions
✅ Unit tests provide explicit coverage and granular reporting
✅ CI workflow hardening ensures wheel-only installation in test workflows
✅ Windows workflow fix prevents source builds that would trigger setuptools-scm fallback in shallow clones

No issues detected. The PR is ready for review.

Recommendations

✅ Ready for merge: All validation mechanisms are functioning correctly
✅ Monitoring: Continue monitoring CI logs for any future 0.1.dev versions (should not occur with current fixes)
✅ Documentation: The PR description accurately reflects the implemented solution

copy-pr-bot · 2026-01-12T08:03:58Z

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

kkraus14 · 2026-01-12T16:23:07Z

cuda_bindings/cuda/bindings/__init__.py

 from cuda.bindings import utils
 from cuda.bindings._version import __version__
+
+assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"


I don't think we should do this outside of an opt-in debug mechanism. If someone grabbed the source code in some weird way that breaks the version resolution via setuptools_scm, we don't want it to error for them unnecessarily, where getting an "incorrect" version would be better than getting this assertion.
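
One possible reading of the opt-in suggestion, as a minimal sketch only: the check runs solely when an environment variable is set, so a source tree with a broken version still imports by default. The variable name `CUDA_PYTHON_STRICT_VERSION_CHECK` and the function `check_version` are hypothetical and do not appear anywhere in this PR.

```python
import os

# Hypothetical variable name, used only for this illustration.
_STRICT_ENV_VAR = "CUDA_PYTHON_STRICT_VERSION_CHECK"


def check_version(version, environ=None):
    """Raise on a fallback-looking version, but only when strict mode is opted into."""
    environ = os.environ if environ is None else environ
    if environ.get(_STRICT_ENV_VAR, "0") != "1":
        return  # default: never block the import
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    if major_minor <= (0, 1):
        raise RuntimeError(f"FATAL: invalid __version__: {version!r}")
```

CI jobs could export the variable to keep the fail-fast behavior, while ordinary users would get the "incorrect" version without an error.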

kkraus14 · 2026-01-12T16:23:40Z

cuda_core/cuda/core/__init__.py


 import importlib

+assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"


kkraus14 · 2026-01-12T16:24:14Z

cuda_core/tests/helpers/__init__.py

+@functools.cache
+def supports_ipc_mempool(device_id: Union[int, object]) -> bool:


Do we need to move this helper in this PR? It would ideally be done in a separate PR.

kkraus14 · 2026-01-12T16:26:24Z

cuda_pathfinder/cuda/pathfinder/__init__.py


+from cuda.pathfinder._version import __version__  # isort: skip
+
+assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"


rwgk added 5 commits January 9, 2026 13:30

rwgk self-assigned this Jan 10, 2026

rwgk added 2 commits January 10, 2026 13:07

Merge branch 'main' into version-safety-checks

255511e

rwgk force-pushed the version-safety-checks branch from 4dfd289 to 1b2e4c0 Compare January 11, 2026 02:35

rwgk added 2 commits January 10, 2026 21:55

rwgk added 2 commits January 11, 2026 12:14

Add skipif IS_WINDOWS for test_patterngen_seeds

475cab3

rwgk added 2 commits January 11, 2026 20:23

rwgk changed the title ~~Implement build-time version validation: git tag checks and PyPI fallback prevention~~ Add __version__ validation: import-time assertions, unit tests, PyPI fallback prevention, CI hardening Jan 12, 2026

rwgk marked this pull request as ready for review January 12, 2026 08:03

rwgk requested a review from mdboom January 12, 2026 08:04

kkraus14 reviewed Jan 12, 2026

