This repository was archived by the owner on Jun 9, 2023. It is now read-only.

@maxhgerlach

Currently, it is not possible to add NVTX tracing to ops that may be executed on the CPU rather than the GPU. Placing such an op on a CPU device raises an exception like:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation .../NvtxStart: Could not satisfy explicit device specification '/device:CPU:0' because no supported kernel for CPU devices is available.

This can be fixed in a straightforward manner by registering CPU kernels for NvtxStart and NvtxEnd. As far as I can tell, NVTX tracing should work fine for non-CUDA code, so this seems generally useful to me.
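
To illustrate why the missing registration produces the error above, here is a minimal toy model of per-device kernel lookup. It is only a sketch and does not use TensorFlow: `KERNELS`, `register_kernel`, `place_op`, and `nvtx_passthrough` are hypothetical names made up for this example. In TensorFlow's C++ API, kernels are normally registered per device with `REGISTER_KERNEL_BUILDER` and a `Device(DEVICE_CPU)` or `Device(DEVICE_GPU)` clause; the model below only mimics that idea.

```python
# Toy model of per-device kernel lookup; NOT TensorFlow's real registry.
# KERNELS, register_kernel, place_op and nvtx_passthrough are hypothetical.

KERNELS = {}  # maps (op_name, device_type) -> kernel callable


def register_kernel(op_name, device_type, kernel):
    """Record that `op_name` has an implementation for `device_type`."""
    KERNELS[(op_name, device_type)] = kernel


def place_op(op_name, device_type):
    """Return the kernel for the requested device, or fail if none exists."""
    try:
        return KERNELS[(op_name, device_type)]
    except KeyError:
        raise LookupError(
            f"no supported kernel for {device_type} devices is available "
            f"for operation {op_name}"
        ) from None


def nvtx_passthrough(x):
    # Stand-in for a real kernel; the actual NVTX range calls are omitted.
    return x


# Before the fix: only GPU kernels exist, so CPU placement fails.
for op in ("NvtxStart", "NvtxEnd"):
    register_kernel(op, "GPU", nvtx_passthrough)

try:
    place_op("NvtxStart", "CPU")
    cpu_placement_failed = False
except LookupError:
    cpu_placement_failed = True

# After the fix: the same ops are also registered for CPU.
for op in ("NvtxStart", "NvtxEnd"):
    register_kernel(op, "CPU", nvtx_passthrough)

cpu_kernel = place_op("NvtxStart", "CPU")
```

In this model, the only change needed is the second registration loop; the lookup logic itself stays untouched, which mirrors why the fix amounts to adding CPU registrations for the two ops.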
