Debug OpenMP programs

When you run an OpenMP program, set the Number of OpenMP threads value to the number of threads you require. Linaro DDT runs your program with the OMP_NUM_THREADS environment variable set to that value.

There are several important points to keep in mind when you debug OpenMP programs:

  • Parallel regions created with #pragma omp parallel (C) or !$OMP PARALLEL (Fortran) will usually not be nested in the Parallel Stack View under the function that contained the #pragma. Instead they will appear under a different top-level item. The top-level item is often in the OpenMP runtime code, and the parallel region appears several levels down in the tree.

  • Some OpenMP libraries create their threads only when the first parallel region is reached, so you may see only one thread at the start of the program.

  • You cannot step into a parallel region. Instead, select Step threads together and use the Run to here command to synchronize the threads at a point inside the region. These controls are discussed in more detail in their own sections of this document.

  • You cannot step out of a parallel region. Instead, use the Run to here command to leave it. Most OpenMP libraries work best if you keep Step threads together selected until you have left the parallel region. With the Intel OpenMP library, this means you will see the Stepping Threads window and will have to click Skip All once.

  • Leave Step threads together clear when you are outside a parallel region, as OpenMP worker threads usually do not follow the same program flow as the main thread.

  • To control threads individually, use Focus on Thread. This allows you to step and play one thread without affecting the rest. This is helpful when you want to work through a locking situation or to bring a stray thread back to a common point. The Focus controls are discussed in more detail in their own section of this document.

  • Shared OpenMP variables may appear twice in the Locals window. This is one of the many unfortunate side-effects of the complex way OpenMP libraries interfere with your code to produce parallelism. One copy of the variable may have a nonsense value, which is usually easy to recognize. The correct values are shown in the Evaluate and Current Line windows.

  • Parallel regions may be displayed as a new function in the Stacks view. Many OpenMP libraries implement parallel regions as automatically-generated outline functions, and Linaro DDT shows you this. To view the value of variables that are not used in the parallel region, you may need to switch to thread 0 and change the stack frame to the function you wrote, rather than the outline function.

  • Stepping often behaves unexpectedly inside parallel regions. Reduction variables usually require some sort of locking between threads, and may even appear to make the current line jump back to the start of the parallel region. If this happens, step over several times and the current line will return to the correct location.

  • Some compilers optimize parallel loops regardless of the options you specified on the command line. This has many strange effects, including code that appears to move backwards as well as forwards, and variables that are not displayed or have nonsense values because they have been optimized out by the compiler.

  • The thread IDs displayed in the Process Group Viewer and Cross-Thread Comparison window will match the value returned by omp_get_thread_num() for each thread, but only if your OpenMP implementation exposes this data to Linaro DDT. GCC’s support for OpenMP (GOMP) must be built with TLS enabled for the thread IDs to match the value returned by omp_get_thread_num(), and your system GCC most likely has this option disabled. The same thread IDs will be displayed as tooltips for the threads in the thread viewer, but only if your OpenMP implementation exposes this data.

If you are using Linaro DDT with OpenMP and would like to tell us about your experiences, please contact Forge Support with the subject line OpenMP feedback.