Known issues for AMD GPU profiling

The following issues are known to affect AMD GPU profiling with Linaro MAP.

  • Linaro MAP sets the environment variable HSA_ENABLE_INTERRUPT=0 by default when running HIP applications. This works around an intermittent hang that has been observed. To restore the default behavior of HIP, set the environment variable FORGE_NO_HSA_INTERRUPT_ENABLE_0=1.

  • Line-level information and program counter (PC) sampling are not available for ROCm kernels.

  • GPU memory transfer analysis is not available for ROCm kernels.

  • HIP kernels generated by offloaded OpenMP regions are not yet supported by Linaro MAP.

  • The graphs are scaled on the assumption that there is a 1:1 relationship between processes and GPUs, each process having exclusive use of its own AMD card. The graphs may be of an unexpected height if some processes do not have a GPU, or if multiple processes share the use of a common GPU.

  • GPU profiling is not supported when statically linking the Linaro Forge sampler library.

  • GPU profiling has been known to cause segmentation faults with ROCm 6.4 when the profiled program calls hipMemcpyFromSymbol. To work around this, disable GPU profiling by setting the environment variable FORGE_SAMPLER_DISABLE_ROCM_PROFILING=1.

  • HIP kernel activity is known not to be recorded correctly with ROCm 6.3 and later unless kernels are synchronized using hipDeviceSynchronize. If your program uses hipEventSynchronize to synchronize kernels, consider replacing those calls with hipDeviceSynchronize.
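
The two environment variables named in the list above could be set in the shell before launching the profiler, along the lines of the following minimal sketch. The variable names and values are taken from the list; the commented-out launch line is an illustration only, and ./my_hip_app is a hypothetical program name. The two settings are independent, so set only the one that applies to your situation.

```shell
# Restore the default HIP interrupt behavior. Without this, Linaro MAP
# sets HSA_ENABLE_INTERRUPT=0 to work around an intermittent hang.
export FORGE_NO_HSA_INTERRUPT_ENABLE_0=1

# Disable ROCm GPU profiling, for example to avoid the segmentation
# faults seen with ROCm 6.4 and hipMemcpyFromSymbol.
export FORGE_SAMPLER_DISABLE_ROCM_PROFILING=1

# Then start profiling as usual (hypothetical program name):
# map --profile ./my_hip_app
```

Exporting the variables makes them visible to the profiler and to the program it launches. With FORGE_SAMPLER_DISABLE_ROCM_PROFILING=1 set, GPU profiling is disabled for that run.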