Guidelines
Recommendations for correctly using and interpreting the results of Arm SPE profiling:
The bars showing the number of SPE samples for source code lines are for visually comparing the relative number of SPE samples between different lines of code within this profile. A full bar indicates that more events were sampled on that source code line than on any other line.
Linaro MAP handles Arm SPE data in a time-agnostic manner: the numbers reported in the SPE Tables tab are from the entire program run, and from across all threads. Selecting a time range or using a Main Thread Only view mode does not change what is reported by the SPE Tables tab or the Arm SPE source code annotations.
To keep the impact of enabling Arm SPE profiling to reasonable levels, Linaro MAP only utilizes a subset of the samples taken by Arm SPE. Therefore, the number of SPE samples (hits) that are taken depends on several factors that can vary between profiled applications, host machine configurations, and versions of Linaro MAP.
You can use Arm SPE profiling in conjunction with the Configurable Perf metrics feature. Linaro recommends that you enable the CPU instruction metrics that relate to the SPE filter you are using. For example, use the branch-misses CPU instruction metric with the mispredict Arm SPE filter. The CPU instruction metrics are both time-based and accurate counts, which mitigates some of the limitations mentioned here.
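The guidance on reading sample bars as relative rather than absolute measures can be illustrated with a short Python sketch. It is not Linaro MAP's implementation: the function relative_bars, the bar width, and the sample counts are all hypothetical, and it assumes bars are scaled against the line with the most samples.

```python
def relative_bars(samples_per_line, width=20):
    """Scale per-line sample counts to bar lengths relative to the busiest line.

    Assumption: a full bar (width characters) corresponds to the line with
    the highest sample count in the profile.
    """
    peak = max(samples_per_line.values(), default=0)
    if peak == 0:
        return {line: "" for line in samples_per_line}
    return {
        line: "#" * round(width * count / peak)
        for line, count in samples_per_line.items()
    }


# Hypothetical hit counts keyed by source line number.
run_a = {10: 50, 11: 200, 12: 100}
# Same code with a tenth of the samples retained, for example on a
# different host configuration or tool version.
run_b = {10: 5, 11: 20, 12: 10}

bars_a = relative_bars(run_a)
bars_b = relative_bars(run_b)

for line in sorted(bars_a):
    print(f"line {line}: {bars_a[line]:<20} | {bars_b[line]}")
```

Under these assumptions the two runs produce identical bars even though their absolute hit counts differ by a factor of ten, which is why the bars are suited to comparing lines within one profile and not to comparing raw counts between profiles.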