Compile and Profile wave_openmp
Prerequisites
You must install all the necessary tools as described in Software requirements.
Procedure
Compile the code. To compile the application, run make on the openmp.makefile file in the examples directory of the Linaro Forge installation:
cd {installation-directory}/examples
make -f openmp.makefile
Profile the application with Linaro MAP. In this worked example, the application is run as a SLURM job using 8 nodes and 2 processes per node:
salloc --nodes=8 map srun --ntasks-per-node=2 ./wave_openmp
Take note of the performance of the application from the job output:
points/second: 162.3M (10.1M per process)
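The per-process figure in the job output follows from the job geometry. As a minimal sketch (the variable names are illustrative and not part of Linaro Forge or SLURM), the arithmetic can be reproduced in the shell:

```shell
# Values taken from the worked example: 8 nodes, 2 processes per node,
# and a reported total of 162.3M points/second.
nodes=8
tasks_per_node=2
total_mpoints=162.3   # total throughput, in millions of points/second

# Total number of processes in the job.
processes=$((nodes * tasks_per_node))

# Shell arithmetic is integer-only, so awk does the division.
# LC_ALL=C keeps the decimal separator as a dot.
per_process=$(LC_ALL=C awk -v t="$total_mpoints" -v p="$processes" \
  'BEGIN { printf "%.1f", t / p }')

echo "processes: $processes"
echo "points/second per process: ${per_process}M"
```

Recording this baseline makes it easier to compare against later runs of the same job.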
Notice that the Thread affinity advisor tool button indicates the presence of thread affinity issues.
Open the Thread affinity advisor dialog. One of the Exemplar nodes is automatically selected.
In this scenario, the SLURM job is run across 8 compute nodes. Each node is fitted with 2 CPU packages, associated with a single NUMA node each. Each CPU package comprises a single L3 cache shared between 4 physical cores. SMT is not in use.
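To make the topology concrete, the following sketch works through the core counts implied by the description above. The variable names are illustrative. The even split of cores between processes is an assumption used only to show what a non-overlapping layout could look like, not something reported by the tool:

```shell
# Hardware description from the scenario above.
nodes=8
packages_per_node=2     # one NUMA node and one L3 cache per package
cores_per_package=4     # physical cores; SMT is not in use
tasks_per_node=2        # from --ntasks-per-node=2

# Physical cores available on each node and across the whole job.
cores_per_node=$((packages_per_node * cores_per_package))
total_cores=$((nodes * cores_per_node))

# Assumption: if cores were divided evenly with no overlap, each process
# would get this many cores, which here equals one full CPU package.
cores_per_process=$((cores_per_node / tasks_per_node))

echo "cores per node: $cores_per_node"
echo "total cores: $total_cores"
echo "cores per process (even split): $cores_per_process"
```

On a compute node, standard tools such as lscpu or numactl --hardware can be used to confirm the package, NUMA, and core layout.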
The Node topology viewer matches the expected hardware topology. The red background indicates that process bindings are overlapping.
Inspect the Commentary to see the issues in greater detail.
Next Steps
Optimize the application job with thread affinities shows you how to optimize the performance of the program by amending the thread affinities of the job.