Starting multi-process programs
If the progress bar does not report that at least process 0 has connected, the remote forge-backend daemons cannot be started or cannot connect to the GUI.
Sometimes problems are caused by environment variables not propagating to the remote nodes while starting a job. To a large extent, the solution to these problems depends on the MPI implementation that is being used.
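One way to check whether a variable reaches the launched processes is to have each rank print it. The following is a minimal sketch, not a definitive procedure. It assumes your MPI provides an mpirun launcher that accepts -n, and MY_APP_SETTING is a hypothetical variable name used only for illustration. Run it inside the same job allocation you use for debugging, so that the ranks land on the remote nodes.

```shell
# MY_APP_SETTING is a hypothetical variable used only for this check.
export MY_APP_SETTING=probe-value

# Assumption: the MPI launcher is called "mpirun" and accepts "-n <count>".
# Each rank runs printenv, which prints the value if the variable arrived
# and exits with a non-zero status if it did not.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -n 2 printenv MY_APP_SETTING \
        || echo "Variable missing on at least one rank, or the launch failed"
else
    echo "mpirun not found; use the launcher for your MPI implementation"
fi
```

If a rank does not print the value, consult the documentation for your MPI implementation to find out how it exports environment variables to remote nodes.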
Solution
If only one, or very few, processes connect, it might be because you have not chosen the correct MPI implementation. Examine the list of MPI implementations and look carefully at the options. If you cannot find another suitable MPI, contact Forge Support.
If the status bar reports that a large number of processes have connected, some might still have failed to start because of resource exhaustion, a timeout, or, unusually, an unexplained crash.
To check for time-out problems, set the FORGE_NO_TIMEOUT environment variable to 1 before launching the GUI and see if further progress is made. This is not a solution, but aids the diagnosis. If all processes can start, contact Forge Support.
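As a minimal sketch for a POSIX-style shell, the variable can be exported and checked as follows. The variable must be exported, not only assigned, so that the GUI inherits it when started from the same shell. The command that starts the GUI depends on your installation and is deliberately left out.

```shell
# Export the variable so that processes started from this shell inherit it.
export FORGE_NO_TIMEOUT=1

# Confirm that a child process sees the value; this prints FORGE_NO_TIMEOUT=1.
sh -c 'echo "FORGE_NO_TIMEOUT=${FORGE_NO_TIMEOUT:-unset}"'

# Now start the GUI from this same shell with your installation's usual launcher.
```

Unset the variable (unset FORGE_NO_TIMEOUT) after the diagnosis, because it is a diagnostic aid and not a fix.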