-
When an MPI application is initialized by calling the MPI_Init function,
the library sets up an initial default communicator, MPI_COMM_WORLD,
of which all processes are members.
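As an illustration, a minimal MPI program of the kind being launched here is
sketched below; the printed message and variable names are only assumptions
for the example.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);               /* sets up MPI_COMM_WORLD */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank within it */

    printf("Process %d is a member of MPI_COMM_WORLD\n", rank);

    MPI_Finalize();
    return 0;
}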
Say on falcon6, you run an MPI program as
mpirun -np 4 mpi_prog1
falcon6 will look at a file:
/usr/local/mpi/util/machines/machines.alpha
to obtain the other 3 hosts.
Such a machines.alpha file is shown below:
% more machines.alpha
falcon2.cz3.nus.edu.sg
falcon3.cz3.nus.edu.sg
falcon4.cz3.nus.edu.sg
............
falcon20.cz3.nus.edu.sg
falcon6 will use itself, falcon2, falcon3, and falcon4 to set up
MPI_COMM_WORLD with a group of 4 processes.
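A small sketch like the one below (variable names are arbitrary) can be used
to check this placement: each of the 4 processes reports the falcon host it
is running on.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, namelen;
    char hostname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(hostname, &namelen);  /* name of the host this rank runs on */

    printf("Rank %d of %d runs on %s\n", rank, size, hostname);

    MPI_Finalize();
    return 0;
}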
Thus, if all of you run MPI jobs like this, falcon2, falcon3, and falcon4
will end up overloaded. To avoid this, you can copy:
/usr/local/mpi/util/machines/machines.alpha
to your own directory and randomize the machine order, for example:
falcon10.cz3.nus.edu.sg
falcon7.cz3.nus.edu.sg
falcon4.cz3.nus.edu.sg
............
falcon5.cz3.nus.edu.sg
Then run the job as
mpirun -machinefile machines.alpha -np 4 mpi_prog1
-
Like other communicators, MPI_COMM_WORLD is represented by an opaque
communicator object, which is invisible to the user and can only be
accessed through specific accessor functions:
MPI_Comm_size(MPI_Comm comm, int *size)
MPI_Comm_rank(MPI_Comm comm, int *rank)
MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
The last accessor obtains from MPI_COMM_WORLD a handle to an opaque
group object, which in turn can only be accessed through its own
accessors, such as
MPI_Group_size(MPI_Group group, int *size)
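Put together, a sketch of these accessors applied to MPI_COMM_WORLD and to
the group extracted from it might look as follows (variable names are
arbitrary):
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int comm_size, comm_rank, group_size;
    MPI_Group world_group;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);    /* number of processes in the communicator */
    MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);    /* this process's rank in it */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group); /* handle to the underlying group */

    MPI_Group_size(world_group, &group_size);     /* same value as comm_size here */

    if (comm_rank == 0)
        printf("communicator size = %d, group size = %d\n", comm_size, group_size);

    MPI_Group_free(&world_group);                 /* release the group handle */
    MPI_Finalize();
    return 0;
}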