Known issues and limitations

Environment

AMD GPU Debugging support in Linaro DDT requires rocgdb to be detectable in your environment. This rocgdb must be compatible with the AMD GPU Driver installed on the system.

Linaro DDT will fail to start if ROCm is selected in the Run Dialog and rocgdb is not detected in the environment. rocgdb is available in the standard ROCm Toolkit installation.

Linaro DDT will fail to start if the ROCm environment is incompatible with the installed AMD GPU Driver. For more information about the error messages reported in this case, see Debugging with ROCgdb in AMD’s rocgdb documentation.

If these environment errors cannot be resolved, you can still debug your application by passing the command-line option --no-rocm or by deselecting ROCm in the Run Dialog. However, AMD GPU Debugging will not be available.

Please contact Forge Support if you encounter an issue.
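
As a quick pre-flight check before launching Linaro DDT, the following minimal sketch looks for rocgdb on the PATH. It assumes that "detectable in your environment" means rocgdb can be found through a PATH lookup, which may not match how Linaro DDT performs its own detection. It does not check compatibility with the installed AMD GPU Driver. The function name rocgdb_preflight is hypothetical.

```python
import shutil


def rocgdb_preflight(path=None):
    """Return (found, location) for a rocgdb executable.

    path is a PATH-style string to search; None means the current PATH.
    Assumption: a PATH lookup approximates what "detectable" means here.
    """
    location = shutil.which("rocgdb", path=path)
    return (location is not None, location)


if __name__ == "__main__":
    found, location = rocgdb_preflight()
    if found:
        print(f"rocgdb found at {location}")
    else:
        # Fallback described above: debug without AMD GPU support.
        print("rocgdb not found on PATH; use --no-rocm or deselect ROCm "
              "in the Run Dialog to debug without AMD GPU support")
```

If the check reports that rocgdb is missing, either add the ROCm installation to your environment or fall back to host-only debugging as described above.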

Limitations

This section provides information about known issues and limitations with ROCm debugging.

  • AMD GPU core files cannot be opened in the same way as core files generated by CPU code.

  • By default, Linaro DDT gathers information about all lanes of each AMD GPU wavefront so that the Parallel Stack View shows the correct number of GPU threads and you can select the 3D thread index in the GPU Thread Selector.

    This might cause performance issues. To gather information at the wavefront level only, set the environment variable FORGE_ROCM_LANES=0. With this environment variable set, the Parallel Stack View shows only the number of wavefronts for GPU Threads. The 3D thread index selector becomes one-dimensional and only allows you to switch between wavefronts.

  • Switching to inactive lanes is currently disabled. If you attempt to switch to an inactive lane, an error message is displayed.

  • To see dispatches started prior to attaching, start the user application with the environment variable HSA_ENABLE_DEBUG=1.

  • Symbolic debugging is only available with ROCm 5.1 and later and the AMD AFAR compiler.

  • GPU Device watchpoints are only supported for global memory.

  • The HIP runtime currently performs deferred code object loading by default. As a result, conditional breakpoints that are set before the first kernel is launched are not hit. To set conditional breakpoints before the first kernel is launched, start the user application with the environment variable HIP_ENABLE_DEFERRED_LOADING=0.

  • There may be issues with dereferencing STL pointers with ROCm 5.1. Please contact Forge Support if you encounter this issue.
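
Several of the limitations above are worked around with environment variables. The following minimal sketch collects them into one environment mapping that can be passed to a launcher. The variable names and values come from the list above. The function name rocm_debug_env and its parameters are hypothetical. Using one shared environment for both Linaro DDT and the user application is an assumption that fits the case where Linaro DDT launches the application itself.

```python
import os


def rocm_debug_env(base=None, wavefront_only=False,
                   show_prior_dispatches=False,
                   early_conditional_breakpoints=False):
    """Return a copy of base (default: os.environ) with workarounds applied."""
    env = dict(os.environ if base is None else base)
    if wavefront_only:
        # Gather information per wavefront instead of per lane.
        env["FORGE_ROCM_LANES"] = "0"
    if show_prior_dispatches:
        # Make dispatches started before attaching visible.
        env["HSA_ENABLE_DEBUG"] = "1"
    if early_conditional_breakpoints:
        # Disable deferred code object loading in the HIP runtime.
        env["HIP_ENABLE_DEFERRED_LOADING"] = "0"
    return env


if __name__ == "__main__":
    env = rocm_debug_env(wavefront_only=True,
                         early_conditional_breakpoints=True)
    for name in ("FORGE_ROCM_LANES", "HSA_ENABLE_DEBUG",
                 "HIP_ENABLE_DEFERRED_LOADING"):
        print(name, "=", env.get(name, "(unset)"))
```

The returned mapping can be passed as the env argument of subprocess.run when starting your launcher. The base environment is copied rather than modified, so the workarounds apply only to the processes you start with it.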

In addition, see Host-side debugging limitations for the differences to expect in host-side debugging when GPU debugging support is enabled.