CUDA memory debugging
There are two options for debugging memory errors in CUDA programs. They can be found in the CUDA section of the Run window.
See Prepare to debug CUDA GPU code before debugging the memory of a CUDA application.
When Track GPU allocations is enabled, CUDA memory allocations made by the host using functions such as cudaMalloc are tracked. You can see how much memory is allocated, and where it was allocated from, in the Current Memory Usage window.
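For example, with this option enabled an allocation like the following would be recorded (a minimal sketch; the buffer name and size are illustrative):

    #include <cuda_runtime.h>

    int main() {
        float *d_buf = nullptr;
        // A host-side runtime API allocation: with Track GPU allocations
        // enabled, this 1 MB allocation and the source line it was made
        // from appear in the Current Memory Usage window.
        cudaMalloc((void **)&d_buf, 1 << 20);
        cudaFree(d_buf);
        return 0;
    }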
Note
CUDA memory allocations made by the GPU/device, such as those made with cuMemAlloc, are currently not tracked, nor are Unified Memory allocations made with cudaMallocManaged. Memory allocations made for CUDA arrays with functions such as cudaMallocArray are also not tracked.
Allocations are tracked separately for each GPU and the host. If you enable Track GPU allocations, host-only memory allocations made using functions such as malloc will be tracked as well.
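To illustrate the distinction, the sketch below contrasts a host-only malloc, which is tracked when the option is enabled, with the Unified Memory and CUDA array allocations from the note above, which are not (the names and sizes are illustrative):

    #include <cuda_runtime.h>
    #include <cstdlib>

    int main() {
        // Host-only allocation: tracked when Track GPU allocations is enabled.
        float *h_buf = (float *)malloc(1024 * sizeof(float));

        // Unified Memory allocation: currently NOT tracked.
        float *m_buf = nullptr;
        cudaMallocManaged((void **)&m_buf, 1024 * sizeof(float));

        // CUDA array allocation: also NOT tracked.
        cudaArray_t arr = nullptr;
        cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
        cudaMallocArray(&arr, &desc, 256, 256);

        cudaFreeArray(arr);
        cudaFree(m_buf);
        free(h_buf);
        return 0;
    }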
You can choose between GPUs using the drop-down list in the top-right corner
of the Memory Usage and Memory Statistics windows.
The Detect invalid accesses (memcheck) option switches on the CUDA-MEMCHECK error detection tool. This tool can detect problems such as out-of-bounds and misaligned global memory accesses, and syscall errors such as calling free() in a kernel on an already-freed pointer.
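As a sketch of the kind of defect this option catches, the following kernel deliberately writes one element past the end of its allocation (the names and sizes are illustrative):

    #include <cuda_runtime.h>

    __global__ void oob_write(int *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        // Deliberate bug: "<=" lets thread n write one element past the
        // end of the allocation, an out-of-bounds global memory access
        // that memcheck reports along with the offending source line.
        if (i <= n) {
            data[i] = i;
        }
    }

    int main() {
        int *d_data = nullptr;
        const int n = 256;
        cudaMalloc((void **)&d_data, n * sizeof(int));
        oob_write<<<1, 512>>>(d_data, n);  // thread 256 writes data[256]
        cudaDeviceSynchronize();
        cudaFree(d_data);
        return 0;
    }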
The other CUDA hardware exceptions (such as a stack overflow) are detected regardless of whether this option is selected.
Note
Detect invalid accesses (memcheck) is not supported with CUDA 12.
For further details about CUDA hardware exceptions, see the NVIDIA documentation.
Note
It is not possible to track GPU allocations created by an OpenACC compiler because it does not directly call cudaMalloc.