Profiling settings
Issue
Your profiling session exhausts the available free memory on a node while Linaro MAP or Linaro Performance Reports is launching, profiling, or analyzing the results. This occurs when using Open MPI 4+ (excluding Open MPI (Compatibility)), MPICH 4, or Cray MPICH under SLURM (MPMD).
Solution
FORGE_MULTIPLEXER_RATIO can be used to trade increased time overhead during
launch and analysis for reduced memory overhead. This environment variable
accepts a range of values depending on how much memory overhead needs reducing.
The value takes the form N:M. This is the ratio of Linaro MAP worker processes
(N) to user processes (M). Examples are as follows.
FORGE_MULTIPLEXER_RATIO=1:2 results in one Linaro MAP worker process per two ranks. This will provide the most memory reduction for the least increase in time overhead.
FORGE_MULTIPLEXER_RATIO=1:3 results in one Linaro MAP worker process per three ranks. This will further reduce memory overhead for a further increase in time overhead.
Additionally, FORGE_MULTIPLEXER_RATIO=0 results in one Linaro MAP worker process
for all ranks. This provides the most memory reduction with the largest
increase in time overhead.
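To make the N:M arithmetic concrete, the following minimal Python sketch estimates how many worker processes a given ratio implies for a given number of ranks. The function name estimated_workers is hypothetical and not part of Linaro Forge, and rounding up when the ranks do not divide evenly is an assumption made for illustration; the number of workers Linaro MAP actually starts may differ.

```python
import math


def estimated_workers(ratio: str, ranks: int) -> int:
    """Estimate worker processes implied by a FORGE_MULTIPLEXER_RATIO value.

    Hypothetical helper for illustration only. Rounding up for uneven
    divisions is an assumption, not documented Linaro Forge behaviour.
    """
    if ratio == "0":
        # The special value 0 means one worker process for all ranks.
        return 1
    workers, users = (int(part) for part in ratio.split(":"))
    if workers <= 0 or users <= 0:
        raise ValueError(f"expected positive N:M, got {ratio!r}")
    return math.ceil(ranks * workers / users)


# Compare the example values from the text for a job with 48 ranks.
for value in ("1:2", "1:3", "0"):
    print(f"FORGE_MULTIPLEXER_RATIO={value}: {estimated_workers(value, 48)} worker(s)")
```

Fewer worker processes means less memory overhead but more time spent during launch and analysis, so it is usually sensible to start with 1:2 and only move to larger reductions if memory is still exhausted.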
Issue
Your program appears to run correctly during profiling, but runs out of memory while Linaro MAP or Linaro Performance Reports collects the results.
Solution
Define FORGE_REDUCE_MEMORY_USAGE=1 in your environment and then rerun Linaro MAP.
This environment variable causes the results for each process on a node
to be processed sequentially instead of in parallel.
This reduces the amount of free memory needed on each node, but takes longer to complete.
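If you launch profiling runs from a script, one way to define the variable is to add it to a copy of the environment that is passed to the launcher. The sketch below is illustrative only: the helper name reduced_memory_env is hypothetical, and the launch line in the final comment is an assumed example that must be adapted to your own MPI launcher and application.

```python
import os
from typing import Mapping, Optional


def reduced_memory_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with FORGE_REDUCE_MEMORY_USAGE=1 set.

    Hypothetical helper; `base` defaults to the current process environment
    and is never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["FORGE_REDUCE_MEMORY_USAGE"] = "1"
    return env


env = reduced_memory_env()
print("FORGE_REDUCE_MEMORY_USAGE =", env["FORGE_REDUCE_MEMORY_USAGE"])

# Assumed launch line (adapt to your site), shown as a comment so the
# sketch runs without Linaro Forge installed:
# subprocess.run(["map", "--profile", "mpirun", "-n", "8", "./my_app"], env=env)
```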