Skip to content

Latest commit

 

History

History
24 lines (14 loc) · 787 Bytes

File metadata and controls

24 lines (14 loc) · 787 Bytes

#NVTX_example

Recent versions of the NVIDIA HPC SDK provide a built-in module that is equivalent to the one in this repro. You just need to load the module nvtx in your source (use nvtx) and link with -cudalib=nvtx

How to call NVTX from Fortran:

https://devblogs.nvidia.com/customize-cuda-fortran-profiling-nvtx/

Updated code, it will work with PGI, XLF and Gfortran compilers.

If you are using nvprof from CUDA 10.x and you are getting an error: ======== Error: incompatible CUDA driver version.

you will need to add "--openmp-profiling off" to the command line.

NVTX markers from Fortran work with the new Nsight tools shipped in CUDA 10.1 Update 2.

To capture a trace run: nsys profile -o outputfile ./a.out

To visualize the trace: nsight-sys full_path_to_outputfile