Migrate NvmlGpuInfo from pynvml to cuda.core.system after 1.0 release #147

@mmccarty

Context

PR #137 introduced NvmlGpuInfo in hardware.py, which wraps raw pynvml calls behind a GpuInfoProvider Protocol. Reviewers (@ncclementi, @mdboom) noted that we should migrate from pynvml to cuda.core.system for consistency with the rest of the CUDA Python ecosystem.

cuda.core.system now exposes all the APIs we need (device count, compute capability, memory info, driver version). However, the migration should not be merged until cuda-core 1.0 is released.

Related

Work needed

Once cuda-core 1.0 is released:

  1. Update NvmlGpuInfo in rapids_cli/hardware.py to use cuda.core.system instead of pynvml
  2. Update HardwareInfoError wrapping to catch cuda.core exceptions instead of pynvml.NVMLError
  3. Remove nvidia-ml-py (pynvml) from dependencies.yaml if fully replaced
  4. Update tests in test_hardware.py to mock cuda.core instead of pynvml
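
A minimal sketch of how steps 1, 2, and 4 could fit together, assuming the provider takes its backend by injection. It deliberately does not call cuda.core.system or pynvml: the backend method names `count_devices` and `read_driver_version`, the `BackendGpuInfo` class, `FakeBackendError`, and the two-method shape of `GpuInfoProvider` are all hypothetical placeholders, and the real signatures in rapids_cli/hardware.py may differ.

```python
from __future__ import annotations

from typing import Any, Protocol
from unittest import mock


class HardwareInfoError(RuntimeError):
    """Stand-in for the project's error type; the real definition may differ."""


class GpuInfoProvider(Protocol):
    # Hypothetical method set; the real Protocol lives in rapids_cli/hardware.py.
    def device_count(self) -> int: ...
    def driver_version(self) -> str: ...


class BackendGpuInfo:
    """Delegates to an injected backend and re-raises its errors as HardwareInfoError."""

    def __init__(
        self, backend: Any, backend_errors: tuple[type[BaseException], ...]
    ) -> None:
        self._backend = backend
        self._backend_errors = backend_errors

    def _call(self, name: str) -> Any:
        # Only the exception types passed in at construction are wrapped, so
        # swapping pynvml for another backend changes one argument, not this code.
        try:
            return getattr(self._backend, name)()
        except self._backend_errors as exc:
            raise HardwareInfoError(f"{name} failed: {exc}") from exc

    def device_count(self) -> int:
        return int(self._call("count_devices"))  # placeholder backend name

    def driver_version(self) -> str:
        return str(self._call("read_driver_version"))  # placeholder backend name


class FakeBackendError(Exception):
    """Placeholder for whatever exception type the new backend raises."""


def make_fake_backend(fail: bool = False) -> mock.Mock:
    """Build a mocked backend, as a test in test_hardware.py might."""
    backend = mock.Mock()
    backend.count_devices.return_value = 2
    if fail:
        backend.read_driver_version.side_effect = FakeBackendError("driver not loaded")
    else:
        backend.read_driver_version.return_value = "0.0.0"
    return backend
```

With this shape, the tests only need to mock the injected backend object, and the switch from pynvml.NVMLError to the cuda.core exception types is confined to the `backend_errors` argument.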
