Add __version__ validation: import-time assertions, unit tests, PyPI fallback prevention, CI hardening
#1454
Add pre-build validation that checks git tag availability directly, to ensure builds fail early with clear error messages before setuptools-scm silently falls back to version '0.1.x'.

Changes:
- cuda_bindings/setup.py: Validate tags at import time (before setuptools-scm)
- cuda_core/build_hooks.py: Validate tags in _build_cuda_core() before building
- cuda_pathfinder/build_hooks.py: New custom build backend that validates tags before delegating to setuptools.build_meta
- cuda_pathfinder/pyproject.toml: Configure the custom build backend

Benefits:
- Fails immediately when `pip install -e .` is run, not during the build
- More direct: tests what setuptools-scm actually needs (git describe)
- Cleaner: no dependency on generated files
- Better UX: clear error messages with actionable fixes

Error messages include:
- A clear explanation of the problem
- The actual git error output
- Common causes (tags not fetched, wrong directory, etc.)
- Package-specific debugging commands
- An actionable fix: `git fetch --tags`
Add validation in _get_cuda_bindings_require() to check whether cuda-bindings is already installed and validate its version compatibility.

Strategy:
- If cuda-bindings is not installed: require a matching CUDA major version
- If installed from sources (editable): keep it regardless of version
- If installed from a wheel: validate that its major version matches the CUDA major version
- Raise a clear error if a version mismatch is detected

This prevents accidentally using PyPI versions that don't match the CUDA toolkit version being used for compilation.

Changes:
- Add _check_cuda_bindings_installed() to detect installation status
- Check for editable installs via direct_url.json, repo location, or .egg-link
- Validate version compatibility in _get_cuda_bindings_require()
- Move imports to module level (PEP 8 compliance)
- Add `noqa: S110` for broad exception handling (intentional)
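The strategy above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name `check_bindings_compatibility` and its parameters are assumptions.

```python
from typing import Optional


def check_bindings_compatibility(
    installed_version: Optional[str],
    is_editable: bool,
    cuda_major: int,
) -> None:
    """Sketch of the strategy: editable installs are trusted as-is,
    wheel installs must match the CUDA major version."""
    if installed_version is None:
        return  # not installed: the build will require a matching version
    if is_editable:
        return  # installed from sources: keep it regardless of version
    installed_major = int(installed_version.split(".")[0])
    if installed_major != cuda_major:
        raise RuntimeError(
            f"cuda-bindings {installed_version} does not match CUDA toolkit "
            f"major version {cuda_major}; uninstall it or install a matching version."
        )
```

The key design point is that an editable install is assumed to be the just-built tree from this repository, so it is never second-guessed against the CUDA toolkit version.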
Remove Methods 2 and 3 for detecting editable installs, keeping only PEP 610 (direct_url.json), which is the standard for Python 3.10+ and pip 20.1+.

Changes:
- Remove Method 2: importing cuda.bindings to check the repo location (problematic during the build-requirement phase)
- Remove Method 3: .egg-link file detection (obsolete for Python 3.10+)
- Keep only the PEP 610 method (direct_url.json), which is reliable and doesn't require importing modules during the build

This fixes build errors caused by importing cuda.bindings during the build-requirement phase, which interfered with Cython compilation.
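PEP 610 records how a distribution was installed in a `direct_url.json` file stored next to its metadata; editable installs carry `{"dir_info": {"editable": true}}`. A check along these lines (a hedged sketch, not the PR's exact code) might be:

```python
import importlib.metadata
import json


def is_editable_install(dist_name: str) -> bool:
    """Return True if dist_name was installed editable, per PEP 610.

    Editable installs record {"dir_info": {"editable": true}} in
    direct_url.json; wheels installed from an index have no such file.
    """
    try:
        dist = importlib.metadata.distribution(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return False  # not installed at all
    direct_url = dist.read_text("direct_url.json")
    if direct_url is None:
        return False  # no PEP 610 record: installed from an index
    return bool(json.loads(direct_url).get("dir_info", {}).get("editable", False))
```

Unlike the removed methods, this never imports the package under inspection, so it is safe to call during the build-requirement phase.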
…s detection

Replace importlib.metadata.distribution() with a direct import of the cuda.bindings._version module. The former may incorrectly return the cuda-core distribution when queried for 'cuda-bindings' in isolated build environments (tested with Python 3.12 and pip 25.3). This may be due to cuda-core metadata being written during the build process before cuda-bindings is fully available, causing importlib.metadata to return the wrong distribution.

Also ensure cuda-bindings is always required in the build environment by returning ['cuda-bindings'] instead of [] when it is already installed. This ensures pip makes it available in isolated build environments even if it is installed elsewhere.

Fix an import sorting inconsistency for the _version import in cuda_pathfinder by adding an 'isort: skip' directive.
Make _validate_git_tags_available() take tag_pattern as a parameter and ensure all three implementations (cuda-core, cuda-pathfinder, cuda-bindings) are identical. Add sync comments to remind maintainers to keep them in sync.

Also fix ruff noqa comments: S603 on the subprocess.run() line, S607 on the list-argument line.
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
/ok to test
Replace the _validate_git_tags_available() functions with a DRY shim/runner pattern:
- Create scripts/git_describe_command_runner.py: shared implementation
- Create git_describe_command_shim.py in each package: thin wrappers that check for the scripts/ directory and delegate to the runner
- Update pyproject.toml files to use git_describe_command_shim.py
- Remove all three copies of _validate_git_tags_available() from build_hooks.py and setup.py

Benefits:
- DRY: single implementation in scripts/
- Portable: Python is always available (no git-in-PATH requirement)
- Clear error messages: shims check for scripts/ and provide context
- No import-time validation: only runs when setuptools-scm calls it
- Cleaner code: cuda_pathfinder/build_hooks.py is now just 13 lines

All three shim files are identical copies and must be kept in sync.
Remove the unnecessary shim files and use scripts/git_describe_wrapper.py directly. Since setuptools_scm runs from the repo root (root = ".."), we can call the shared wrapper script directly without package-specific shims.

Changes:
- Remove all three git_describe_command_shim.py files
- Update pyproject.toml files to use scripts/git_describe_wrapper.py
- Remove cuda_pathfinder/build_hooks.py (was just a pass-through)
- Remove the pre-commit hook for checking shim files
- Rename git_describe_command_runner.py to git_describe_wrapper.py

This simplifies the codebase while maintaining the same functionality:
- A single shared implementation for git describe
- Clear error messages when tags are missing
- Works correctly from the repo root where setuptools-scm runs
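For illustration, a shared wrapper of this kind could look like the sketch below. The file name `scripts/git_describe_wrapper.py` comes from the commit; the internals here (function names, exact message wording) are assumptions.

```python
"""Sketch of a shared git-describe wrapper: run `git describe` for a tag
pattern and fail loudly instead of letting setuptools-scm fall back to 0.1.dev*."""
import subprocess
import sys


def error_message(tag_pattern: str, git_error: str) -> str:
    # Clear, actionable message instead of a silent fallback version.
    return (
        f"ERROR: no git tag matching {tag_pattern!r} is reachable.\n"
        f"git output: {git_error}\n"
        "Common causes: shallow clone, tags not fetched, wrong directory.\n"
        "Fix: git fetch --tags"
    )


def describe(tag_pattern: str) -> str:
    cmd = ["git", "describe", "--dirty", "--tags", "--long", "--match", tag_pattern]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        sys.exit(error_message(tag_pattern, result.stderr.strip()))
    return result.stdout.strip()
```

Because setuptools-scm invokes the configured describe command from the repo root, one copy of this script can serve all three packages, each passing its own tag pattern.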
(force-pushed from 4dfd289 to 1b2e4c0)
setuptools-scm Shallow Clone Fallback Issue (exists already on
Add build-time validation to detect when setuptools-scm falls back to default versions (0.0.x or 0.1.dev*) due to shallow clones or missing git tags. This prevents silent failures that cause dependency conflicts.

Changes:
- scripts/validate_version.py: New DRY validation script that checks for fallback versions and validates against expected patterns
- cuda_core/build_hooks.py: Add validation in the prepare_metadata hooks
- cuda_pathfinder/build_hooks.py: New build hooks with version validation
- cuda_pathfinder/pyproject.toml: Use the custom build_hooks backend
- cuda_bindings/setup.py: Add a ValidateVersion command class

The validation runs after setuptools-scm generates the _version.py files, ensuring we catch fallback versions before builds complete. This will cause the 18 Windows CI tests that currently use fallback versions to fail with clear error messages instead of silently using wrong versions.

Related to the shallow clone issue documented in PR NVIDIA#1454.
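A minimal sketch of the kind of check scripts/validate_version.py could perform — the regex and function name below are assumptions, not the script's actual contents:

```python
import re

# setuptools-scm fallbacks start with 0.0 or 0.1 (e.g. "0.1.dev3+g1b2e4c0"),
# while all real cuda-python releases have major.minor well above that.
_FALLBACK = re.compile(r"^0\.[01](\.|$|\+|dev)")


def looks_like_fallback(version: str) -> bool:
    """True for fallback-shaped versions like '0.0', '0.1.dev3+g...'."""
    return bool(_FALLBACK.match(version))
```

Note the trailing alternation: it keeps legitimate versions such as `0.10.2` from being flagged, since only an exact `0.0` or `0.1` prefix (followed by a separator or end of string) counts as a fallback.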
…etadata

Move validation from prepare_metadata_for_build_* to build_editable/build_wheel, where _version.py definitely exists. This fixes build failures where validation ran before setuptools-scm wrote the version file.
This commit adds version number validation tests to detect when automatic version detection fails and falls back to invalid versions (e.g., 0.0.x or 0.1.dev*). This addresses two main concerns:

1. Fallback version numbers going undetected: When setuptools-scm cannot detect the version from git tags (e.g., due to shallow clones), it silently falls back to default versions like 0.1.dev*. These invalid versions can cause dependency conflicts and confusion in production/SWQA environments.

2. PyPI wheel replacement: The critical issue of just-built cuda-bindings being replaced with PyPI wheels is already handled by _check_cuda_bindings_installed() in cuda_core/build_hooks.py.

Rather than attempting complex early detection in build hooks (which proved fragile due to timing issues with when _version.py files are written), we implement late-stage detection via test files. This approach is:
- Simpler: no complex build-hook timing issues
- Reliable: tests run after installation, when versions are definitely available
- Sufficient: catches issues before they reach production/SWQA

Changes:
- Add a validate_version_number() function to cuda_python_test_helpers for centralized validation logic
- Create minimal test_version_number.py files in cuda_bindings, cuda_core, and cuda_pathfinder that import the version and call the validation function
- Add helpers/__init__.py files in cuda_bindings/tests and cuda_pathfinder/tests to enable importing from cuda_python_test_helpers
- Update cuda_core/tests/helpers/__init__.py to use ModuleNotFoundError instead of ImportError for consistency

The validation checks that versions have major.minor > 0.1, which is sufficient since all three packages are already at higher versions. Error messages explain the issue without referencing setuptools-scm internals.
The supports_ipc_mempool function was previously defined in cuda_python_test_helpers, but it had a hard dependency on cuda.core._utils.cuda_utils.handle_return. This caused CI failures when cuda_bindings or cuda_pathfinder tests tried to import cuda_python_test_helpers, because cuda-core might not be installed in those test environments.

By moving supports_ipc_mempool to cuda_core/tests/helpers/__init__.py, we ensure that:
- cuda_python_test_helpers remains free of cuda-core-specific dependencies
- The function is only available where cuda-core is guaranteed to be installed (i.e., in cuda_core tests)
- cuda_bindings and cuda_pathfinder can safely import cuda_python_test_helpers without requiring cuda-core

Changes:
- Move supports_ipc_mempool from cuda_python_test_helpers to cuda_core/tests/helpers/__init__.py
- Update cuda_core/tests/test_memory.py to import from helpers instead of cuda_python_test_helpers
- Remove unused imports (functools, Union, handle_return) from cuda_python_test_helpers/__init__.py
- Remove supports_ipc_mempool from the cuda_python_test_helpers __all__

This fixes CI failures where importing cuda_python_test_helpers would fail due to missing cuda-core dependencies.
The main purpose of these tests is to validate that dependencies have valid version numbers, not just the package being tested. This is critical for catching cases where a dependency (e.g., cuda-pathfinder) might be built with a fallback version (0.1.dev...) due to shallow git clones or missing tags.

To ensure we see all invalid versions in a single test run, we organize the tests as separate test functions (test_bindings_version, test_core_version, test_pathfinder_version) rather than combining them into a single function. This way, if multiple packages have invalid versions, pytest will report all failures rather than stopping at the first one.

Changes:
- cuda_bindings/tests/test_version_number.py: Tests both the cuda-bindings and cuda-pathfinder versions
- cuda_core/tests/test_version_number.py: Tests the cuda-bindings, cuda-core, and cuda-pathfinder versions
- cuda_pathfinder/tests/test_version_number.py: Tests the cuda-pathfinder version (renamed function for consistency)
Add minimal assertions immediately after importing __version__ in all three
package __init__.py files to fail fast if an invalid version (e.g., 0.1.dev...)
is detected. This prevents packages with fallback versions from being imported
or used, catching the issue at the earliest possible point.
The assertion checks that major.minor > (0, 1) using a minimal one-liner:
assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"
Strictly speaking this makes the unit tests redundant, but we want to keep
the unit tests as a second line of defense. The assertions provide immediate
feedback during import, while the unit tests provide explicit test coverage
and clearer error messages in CI logs.
Changes:
- cuda_bindings/cuda/bindings/__init__.py: Add version assertion
- cuda_core/cuda/core/__init__.py: Add version assertion
- cuda_pathfinder/cuda/pathfinder/__init__.py: Add version assertion
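The one-liner only parses the first two dot-separated components, so setuptools-scm suffixes like `.dev20+g<hash>` never reach `int()`. A quick illustration (the helper name is ours, the expression is the assertion's):

```python
def passes_version_gate(version: str) -> bool:
    # Same expression as the import-time assertion: only major.minor matter.
    return tuple(int(_) for _ in version.split(".")[:2]) > (0, 1)


assert passes_version_gate("13.0.1")         # real release: passes
assert not passes_version_gate("0.1.dev20")  # setuptools-scm fallback: fails
assert not passes_version_gate("0.0")        # bare fallback: fails
```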
Use an intentionally shallow clone (fetch-depth: 1) to test wheel installation without full git history. This ensures we're testing the wheel artifacts themselves, not building from source.

Changes:
- Set fetch-depth: 1 explicitly (although it is the default) with a comment emphasizing that shallow cloning is intentional
- Add --only-binary=:all: to the cuda-python installation to ensure we only test wheels, never build from source
- Add a "Verify installed package versions" step that imports cuda.pathfinder and cuda.bindings to trigger the __version__ assertions immediately after installation, providing early detection of invalid versions
- Update comments to accurately reflect that we're testing wheel artifacts

This approach hardens the test workflows by:
- Making the shallow clone intentional and explicit
- Actually testing that the __version__ assertions work (fail fast on invalid versions)
- Catching version issues immediately after installation, before tests run
- Ensuring we only test wheels, not source builds

Applied consistently to both:
- .github/workflows/test-wheel-windows.yml
- .github/workflows/test-wheel-linux.yml
__version__ validation: import-time assertions, unit tests, PyPI fallback prevention, CI hardening
CI Logs Analysis: PR #1454 vs Main Branch

Summary

After carefully analyzing the CI logs from PR #1454 (

Detailed Findings

0.1.dev Versions Still Appear

PR Logs: Found

The same number of occurrences suggests this is a pre-existing issue, not something introduced by the PR.

Where They Come From

The bad versions are built during the "Install cuda.pathfinder extra wheels for testing" step. The issue is in the Windows workflow:

Windows workflow: `pip install --only-binary=:all: -v . --group "test-cu${TEST_CUDA_MAJOR}"`

Linux workflow: `pip install --only-binary=:all: -v ./*.whl --group "test-cu${TEST_CUDA_MAJOR}"`

The Problem: When pip processes `.`, it builds the package from source during metadata preparation; when pip processes prebuilt wheels (`./*.whl`), it does not.

What Happens Next

The sequence in affected Windows tests:

Version Tests Status

All version tests pass in both PR and main branch logs.

This confirms that:

Comparison: PR vs Main

Key Difference: The PR has the "Verify installed package versions" step that explicitly tests the import-time assertions, providing early detection. The main branch doesn't have this step, but also doesn't fail because the bad versions get replaced before the tests run.

Remaining Issue

While the bad versions are caught and replaced, they still cause:

Recommended Fix

Change the Windows workflow to match the Linux workflow:

- `pip install --only-binary=:all: -v . --group "test-cu${TEST_CUDA_MAJOR}"`
+ `pip install --only-binary=:all: -v ./*.whl --group "test-cu${TEST_CUDA_MAJOR}"`

This will prevent pip from building from source during metadata preparation, eliminating the temporary 0.1.dev versions.

Conclusion

The PR's version validation mechanisms are working correctly.

The remaining
Change pip install command from '.' to './*.whl' to prevent pip from building from source during metadata preparation. This matches the Linux workflow and eliminates the temporary 0.1.dev versions that were being built in shallow clones. See: NVIDIA#1454 (comment)
CI Logs Analysis: PR #1454 (post commit c574a94)

CI Run:

Executive Summary

✅ All checks passed successfully. The fix to use

Key Findings

1. No
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.
> `from cuda.bindings import utils`
> `from cuda.bindings._version import __version__`

> `assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"`

I don't think we should do this outside of an opt-in debug mechanism. If someone grabbed the source code in some weird way that breaks the version resolution via setuptools_scm, we don't want it to error for them unnecessarily, where getting an "incorrect" version would be better than getting this assertion.
> `import importlib`

> `assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"`

+1
> `@functools.cache`
> `def supports_ipc_mempool(device_id: Union[int, object]) -> bool:`

Do we need to move this helper in this PR? It would ideally be done in a separate PR.
> `from cuda.pathfinder._version import __version__  # isort: skip`

> `assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"`

+1
This PR adds safety mechanisms to prevent version-related issues during builds and ensures version detection from git tags is working as intended.
Version Number Validation
Problem:
`setuptools-scm` silently falls back to version `0.1.x` when git tags are unavailable (e.g., due to shallow clones), which can lead to incorrect version detection and unexpected dependency resolution. Silent failures in the procedure producing `__version__` can lead to highly confusing behavior and potentially invalidate elaborate QA testing.

Solution: We implement a two-layer defense strategy to catch invalid versions at multiple stages.
First Line of Defense: Import-Time Assertions
Fail-fast assertions are added immediately after importing `__version__` in all three package `__init__.py` files:

- `cuda_bindings/cuda/bindings/__init__.py`
- `cuda_core/cuda/core/__init__.py`
- `cuda_pathfinder/cuda/pathfinder/__init__.py`

Each file includes a minimal one-liner assertion: `assert tuple(int(_) for _ in __version__.split(".")[:2]) > (0, 1), "FATAL: invalid __version__"`

This ensures that any attempt to import a package with an invalid version (e.g., `0.1.dev...`) fails immediately with a clear error, preventing the package from being used at all. The assertion checks that `major.minor > (0, 1)`, which is sufficient since all three packages are already at higher versions.

Second Line of Defense: Unit Tests
As a backup, we also implement late-stage detection via regular unit tests that validate version numbers after installation:
- Add a `validate_version_number()` function to `cuda_python_test_helpers` for centralized validation logic
- Create minimal `test_version_number.py` files in `cuda_bindings`, `cuda_core`, and `cuda_pathfinder` that import the version and call the validation function
- Organize the checks as separate test functions (`test_bindings_version`, `test_core_version`, `test_pathfinder_version`) so all invalid versions are reported in a single test run
- Validate dependency versions as well (e.g., the `cuda_bindings` tests check both `cuda-bindings` and `cuda-pathfinder` versions)

The unit tests run during the test phase and provide explicit test coverage with clearer error messages in CI logs.
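For concreteness, the centralized helper could look roughly like this. The function name comes from the PR; the body and message wording are assumptions.

```python
def validate_version_number(version: str, package_name: str) -> None:
    """Raise if the version looks like a fallback produced when the build
    could not see the release git tags (e.g. in a shallow clone)."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    if major_minor <= (0, 1):
        raise AssertionError(
            f"{package_name} reports version {version!r}, which indicates the "
            "version could not be determined correctly at build time. "
            "Rebuild from a checkout with release tags, or install a published wheel."
        )
```

Consistent with the PR's stated goal, the message explains the problem without referencing setuptools-scm internals.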
Why Two Layers?
While the import-time assertions provide immediate feedback and prevent invalid packages from being imported, we maintain the unit tests as a second line of defense because:
- Silent failures in the procedure producing `__version__` can lead to highly confusing behavior and potentially invalidate elaborate QA testing. Multiple detection points reduce the risk of invalid versions going undetected.

CI Workflow Hardening
To ensure version validation works correctly in CI environments, we've hardened the test workflows:
Intentional Shallow Clone: Test workflows explicitly use `fetch-depth: 1` (the default) with a comment emphasizing that shallow cloning is intentional. This ensures we're testing wheel installation without full git history, which is the correct behavior for testing pre-built artifacts.

Wheel-Only Installation: The "Ensure cuda-python installable" step uses `--only-binary=:all:` to ensure we only test wheels, never build from source. This prevents pip from building packages from source when source code is present, which could lead to version issues in shallow clones.

Immediate Version Verification: After installing `cuda-python`, a new "Verify installed package versions" step imports `cuda.pathfinder` and `cuda.bindings` to trigger the `__version__` assertions immediately. This provides early detection of invalid versions right after installation, before any tests run.

These changes ensure that:
Why Not Early-Stage Detection?
We initially attempted early-stage detection in build hooks (`build_wheel`, `build_editable`) to catch fallback versions during the build process. However, this approach proved too fragile:

Timing Issues: `_version.py` files are written by `setuptools-scm` during `prepare_metadata_for_build_wheel`, but validation needs to run at the right time in the build process. Attempting to validate too early results in "file not found" errors, while validating too late allows builds to complete with invalid versions.

Shallow Clone Handling: When `setuptools-scm` detects a shallow clone, it bypasses `git_describe_command` entirely and falls back to `0.0` or `0.1.x` versions before our validation can run. This makes build-time detection unreliable in CI environments that use shallow clones.

Complexity: The build hook approach required careful coordination between PEP 517 hooks (`prepare_metadata_for_build_wheel`, `build_wheel`) and custom validation logic, making it error-prone and difficult to maintain.

Given these challenges, we decided to use the simpler and more certain approach: import-time assertions for immediate feedback, unit tests for explicit coverage, and CI workflow hardening to ensure the validation works correctly in all environments.
PyPI Fallback Prevention
Problem: During isolated (PEP 517) builds, a just-built `cuda-bindings` installation could be incorrectly replaced with a PyPI wheel if the installed version didn't match expectations.

Solution: Enhanced `cuda_core/build_hooks.py` to:

- Detect the installed `cuda-bindings` version via a direct import of `cuda.bindings._version` (note that `importlib.metadata` cannot be used in isolated environments)
- Treat the install as editable (built from this repo) when the `_version.py` file path is within the repository root
- Raise an exception if `cuda-bindings` is installed non-editable and its major version doesn't match the CUDA major version

This prevents accidental installation of incompatible `cuda-bindings` versions from PyPI during builds.

Piggy-backed:
Import Sorting Fix
Problem: Ruff's import sorting (`I001`) was inconsistently reordering the `_version` import in `cuda_pathfinder/cuda/pathfinder/__init__.py`, depending on whether `_version.py` exists or not (e.g., after `git clean -fdx`).

Solution: Added a `# isort: skip` directive to the `_version` import line to prevent ruff from moving it. This ensures consistent import ordering regardless of build state.

Note on setuptools-scm RuntimeWarning
We've observed `RuntimeWarning`s from `setuptools-scm` that incorrectly display package versions instead of `setuptools` versions (e.g., `ERROR: setuptools==0.5.1.dev20+gf8dddb370 is used in combination with setuptools-scm>=8.x`). This appears to be a known issue in `setuptools-scm` when using custom build backends (see setuptools-scm issue #1192). A minimal reproducer has been created at github.com/rwgk/setuptools-scm-issue-1192.

These warnings don't affect functionality but are noisy. They occur on both `main` and this branch.