Compile and Profile wave_openmp

Prerequisites

You must install all the necessary tools as described in Software requirements.

Procedure

  1. Compile the code. In the examples directory of your Linaro Forge installation, run make with the openmp.makefile file:

    cd {installation-directory}/examples
    make -f openmp.makefile
    
  2. Profile the application with Linaro MAP. In this worked example, the application runs as a SLURM job on 8 nodes with 2 processes per node (16 processes in total):

    salloc --nodes=8
    map srun --ntasks-per-node=2 ./wave_openmp
    

    Note the application performance reported in the job output:

    points/second: 162.3M (10.1M per process)
    
  3. Notice that the Thread affinity advisor tool button indicates the presence of thread affinity issues:

    ../../_images/map_thread_affinity_example_error_icon.png

  4. Open the Thread affinity advisor dialog. One of the Exemplar nodes is automatically selected.

    In this scenario, the SLURM job runs across 8 compute nodes. Each node has 2 CPU packages, each associated with a single NUMA node. Each CPU package has a single L3 cache shared between 4 physical cores. SMT is not in use.

    The Node topology viewer matches the expected hardware topology. The red background indicates that process bindings are overlapping:

    ../../_images/map_thread_affinity_example_topology_overlapping.png

    Inspect the Commentary to see the issues in greater detail:

    ../../_images/map_thread_affinity_example_commentary_overlapping.png
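
As a quick sanity check on the figures in this worked example, the job geometry and the reported throughput can be reproduced with a few lines of shell arithmetic. This is only an illustrative sketch: the variable names are hypothetical, the values are copied from the commands, job output, and node description above, and the even split of cores between the two processes on a node is an assumption rather than something the tools report.

```shell
#!/bin/sh
# Values copied from the worked example (variable names are illustrative).
nodes=8               # salloc --nodes=8
tasks_per_node=2      # srun --ntasks-per-node=2
packages_per_node=2   # one NUMA node per CPU package
cores_per_package=4   # physical cores sharing one L3 cache; SMT is not in use
total_mpoints=162.3   # "points/second: 162.3M" from the job output

# Total number of processes in the job.
processes=$((nodes * tasks_per_node))

# Physical cores available on each node.
cores_per_node=$((packages_per_node * cores_per_package))

# Assumption: cores are divided evenly between the processes on a node.
cores_per_process=$((cores_per_node / tasks_per_node))

# Shell arithmetic is integer-only, so use awk for the floating-point division.
per_process=$(awk -v t="$total_mpoints" -v p="$processes" \
    'BEGIN { printf "%.1f", t / p }')

echo "processes: $processes"
echo "cores per node: $cores_per_node"
echo "cores per process without overlap: $cores_per_process"
echo "points/second per process: ${per_process}M"
```

The result of 16 processes and roughly 10.1M points/second per process agrees with the job output. The 8 physical cores on each node also leave room for each of the 2 processes to have 4 cores to itself, so the overlapping bindings are not forced by the hardware.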

Next Steps

The next topic, Optimize the application job with thread affinities, shows you how to improve the performance of the program by amending the thread affinities of the job.