Control GPU threads

To control GPU threads, use the standard play, pause, and breakpoint controls. All of them apply to GPU kernels.

However, because GPUs have a different execution model from CPUs, there are a few behavioral differences, which are described below.

GPU breakpoints

GPU breakpoints can be set in the same way as other breakpoints. See Set breakpoints.

When a kernel pauses at a breakpoint, the currently selected GPU thread changes if the previously selected thread is no longer ‘alive’.

For more information about NVIDIA GPU breakpoint handling, see NVIDIA GPU Breakpoints.

For more information about AMD GPU breakpoint handling, see AMD GPU Breakpoints.

For more information about Intel Xe GPU breakpoint handling, see Intel Xe GPU Breakpoints.

Stepping

The GPU execution model is noticeably different from that of the host CPU. This affects the stepping operations (step in, step over, and step out) in the ways described below for each vendor.

NVIDIA

The smallest execution unit on an NVIDIA GPU is a warp, which on current GPUs is 32 threads. All threads in a warp execute in lockstep, which means that you cannot step each thread individually. All active threads in the warp execute the step at the same time.
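To see which threads move together on a step, it can help to work out the warp and lane of a given thread. The following is a minimal sketch, assuming a one-dimensional thread block in which consecutive thread indices are grouped into warps. The names `WarpCoord` and `warp_coord` are hypothetical and are not part of any debugger or CUDA API.

```cuda
// Hypothetical helper type: the warp a thread belongs to within its block,
// and the thread's position (lane) inside that warp.
struct WarpCoord {
    unsigned warp;
    unsigned lane;
};

// Assumes a one-dimensional block, so the linear thread index is threadIdx.x.
// The default of 32 matches the warp size stated above; device code can pass
// the built-in warpSize variable instead.
__host__ __device__ inline WarpCoord warp_coord(unsigned thread_in_block,
                                                unsigned warp_size = 32) {
    return WarpCoord{thread_in_block / warp_size,
                     thread_in_block % warp_size};
}
```

For example, `warp_coord(70)` gives warp 2, lane 6, so stepping thread 70 also steps every other active thread with indices 64 to 95 in the same block. Passing a different width as the second argument models the other execution-unit sizes described in this section.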

It is not currently possible to step over or step out of inlined GPU functions.

Note

NVIDIA GPU functions are often inlined by the compiler. Depending on the hardware, you can avoid this by adding the __noinline__ qualifier to your function declaration.
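As an illustration of the note above, the following sketch marks a device function with `__noinline__` so that the compiler is asked to keep it as a real call that can be stepped over or out of. The function, kernel, and launcher names (`scale_and_shift`, `transform`, `run_transform`) are hypothetical, and the launch configuration of 64 threads per block is an arbitrary choice for the example.

```cuda
#include <cuda_runtime.h>

// Hypothetical device helper. __noinline__ asks the compiler not to inline
// this function into its callers.
__device__ __noinline__ float scale_and_shift(float x, float a, float b) {
    return a * x + b;
}

// Hypothetical kernel: each thread transforms one element of the array.
__global__ void transform(float *data, int n, float a, float b) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // A breakpoint on this line gives a call to step into or over.
        data[i] = scale_and_shift(data[i], a, b);
    }
}

// Host-side launcher, included only to make the sketch complete.
// Copies n floats to the device, runs the kernel, and copies them back.
inline cudaError_t run_transform(float *host, int n, float a, float b) {
    float *dev = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    cudaError_t err = cudaMalloc(&dev, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 64;  // arbitrary block size for this example
        transform<<<(n + threads - 1) / threads, threads>>>(dev, n, a, b);
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return err;
}
```

When building for debugging, compiling with `nvcc -g -G` generates host and device debug information. Whether the function is actually kept out of line still depends on the compiler and hardware, as the note says.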

AMD

The smallest execution unit on an AMD GPU is a wavefront, which on current GPUs is 64 threads. All threads in a wavefront execute in lockstep, which means that you cannot step each thread individually. All active threads in the wavefront execute the step at the same time.

Intel Xe

The smallest execution unit on an Intel Xe GPU is a sub-group, which on current GPUs is typically 8, 16, or 32 threads. All threads in a sub-group execute in lockstep, which means that you cannot step each thread individually. All active threads in the sub-group execute the step at the same time.

Running and pausing

Click Play/Continue to run all GPU threads. It is not possible to run individual blocks, warps, or threads (NVIDIA); workgroups, wavefronts, or threads (AMD); or workgroups, sub-groups, or threads (Intel Xe).

Click Pause to pause a running kernel. Note that pausing a GPU kernel takes longer than pausing code running on a CPU.