NAME
       orterun, mpirun, mpiexec - Execute serial and parallel jobs in
       Open MPI.

       Note: mpirun, mpiexec, and orterun are all exact synonyms for each
       other. Using any of the names will result in exactly identical
       behavior.

SYNOPSIS
       Single Process Multiple Data (SPMD) Model:

       mpirun [ options ] <program> [ <args> ]

       Multiple Instruction Multiple Data (MIMD) Model:

       mpirun [ global_options ]
              [ local_options1 ] <program1> [ <args1> ] :
              [ local_options2 ] <program2> [ <args2> ] :
              ... :
              [ local_optionsN ] <programN> [ <argsN> ]

       Note that in both models, invoking mpirun via an absolute path name
       is equivalent to specifying the -prefix option with a value
       equivalent to the directory where mpirun resides, minus its last
       subdirectory. For example:

           shell$ /usr/local/bin/mpirun ...

       is equivalent to

           shell$ mpirun -prefix /usr/local

QUICK SUMMARY
       If you are simply looking for how to run an MPI application, you
       probably want to use a command line of the following form:

           shell$ mpirun [ -np X ] [ -hostfile <hostfile> ] <program>

       This will run X copies of <program> in your current run-time
       environment, scheduling (by default) in a round-robin fashion by
       CPU slot. If running under a supported resource manager, Open
       MPI's mpirun will usually automatically use the corresponding
       resource manager process starter. Otherwise it uses, for example,
       rsh or ssh, which either require the use of a hostfile or default
       to running all X copies on the localhost. See the rest of this
       page for more details.

OPTIONS
       mpirun will send the name of the directory where it was invoked on
       the local node to each of the remote nodes, and attempt to change
       to that directory. See the "Current Working Directory" section
       below for further details.

       <args>
              Pass these run-time arguments to every new process. These
              must always be the last arguments to mpirun. If an app
              context file is used, <args> will be ignored.

       <program>
              The program executable. This is identified as the first
              non-recognized argument to mpirun.

       -aborted, --aborted <#>
              Set the maximum number of aborted processes to display.

       --app
              Provide an appfile, ignoring all other command line options.

       -bynode, --bynode
              Allocate (map) the processes by node in a round-robin
              scheme.

       -byslot, --byslot
              Allocate (map) the processes by slot in a round-robin
              scheme. This is the default.

       -c <#>
              Synonym for -np.

       -debug, --debug
              Invoke the user-level debugger indicated by the
              orte_base_user_debugger MCA parameter.

       -debugger, --debugger
              Sequence of debuggers to search for when -debug is used
              (i.e., a synonym for the orte_base_user_debugger MCA
              parameter).

       -gmca, --gmca <key> <value>
              Pass global MCA parameters that are applicable to all
              contexts. <key> is the parameter name; <value> is the
              parameter value.

       -h, --help
              Display help for this command.

       -H
              Synonym for -host.

       -host, --host
              List of hosts on which to invoke processes.

       -hostfile, --hostfile
              Provide a hostfile to use.

       -machinefile, --machinefile
              Synonym for -hostfile.

       -mca, --mca <key> <value>
              Send arguments to various MCA modules. See the "MCA"
              section, below.

       -n, --n <#>
              Synonym for -np.

       -nolocal, --nolocal
              Do not run any copies of the launched application on the
              same node as orterun is running. This option will override
              listing the localhost with --host or any other
              host-specifying mechanism.

       -nooversubscribe, --nooversubscribe
              Do not oversubscribe any nodes; error (without starting any
              processes) if the requested number of processes would cause
              oversubscription. This option implicitly sets "max_slots"
              equal to the "slots" value for each node.

       -np <#>
              Run this many copies of the program on the given nodes.
              This option indicates that the specified file is an
              executable program and not an application context. If no
              value is provided for the number of copies to execute
              (i.e., neither "-np" nor its synonyms are provided on the
              command line), Open MPI will automatically execute a copy
              of the program on each process slot (see below for a
              description of a "process slot"). This feature, however,
              can only be used in the SPMD model and will return an error
              (without beginning execution of the application) otherwise.

       -nw, --nw
              Launch the processes and do not wait for their completion.
              mpirun will complete as soon as successful launch occurs.

       -path, --path <path>
              <path> that will be used when attempting to locate
              requested executables.

       --prefix
              Prefix directory that will be used to set the PATH and
              LD_LIBRARY_PATH on the remote node before invoking Open MPI
              or the target process. See the "Remote Execution" section,
              below.

       -q, --quiet
              Suppress informative messages from orterun during
              application execution.

       --tmpdir
              Set the root for the session directory tree for mpirun
              only.

       -tv, --tv
              Launch processes under the TotalView debugger. Deprecated
              backwards compatibility flag. Synonym for -debug.

       --universe
              For this application, set the universe name as:
              username@hostname:universe_name

       -v, --verbose
              Be verbose.

       -V, --version
              Print version number. If no other arguments are given, this
              will also cause orterun to exit.

       -wd
              Synonym for -wdir.

       -wdir <dir>
              Change to the directory <dir> before the user's program
              executes. See the "Current Working Directory" section for
              notes on relative paths. Note: If the -wdir option appears
              both on the command line and in an application context, the
              context will take precedence over the command line.

       -x
              Export the specified environment variables to the remote
              nodes before executing the program. Existing environment
              variables can be specified (see the Examples section,
              below), or new variable names specified with corresponding
              values. The parser for the -x option is not very
              sophisticated; it does not even understand quoted values.
              Users are advised to set variables in the environment, and
              then use -x to export (not define) them.

       The following options are useful for developers; they are not
       generally useful to most ORTE and/or MPI users:

       -d, --debug-devel
              Enable debugging of the OpenRTE (the run-time layer in Open
              MPI). This is not generally useful for most users.

       --debug-daemons
              Enable debugging of any OpenRTE daemons used by this
              application.

       --debug-daemons-file
              Enable debugging of any OpenRTE daemons used by this
              application, storing output in files.

       --no-daemonize
              Do not detach OpenRTE daemons used by this application.

DESCRIPTION
       One invocation of mpirun starts an MPI application running under
       Open MPI. If the application is single process multiple data
       (SPMD), the application can be specified on the mpirun command
       line.

       If the application is multiple instruction multiple data (MIMD),
       comprising multiple programs, the set of programs and arguments
       can be specified in one of two ways: Extended Command Line
       Arguments, and Application Context.

       An application context describes the MIMD program set, including
       all arguments, in a separate file. This file essentially contains
       multiple mpirun command lines, less the command name itself. The
       ability to specify different options for different instantiations
       of a program is another reason to use an application context.

       Extended command line arguments allow for the description of the
       application layout on the command line using colons (:) to
       separate the specification of programs and arguments. Some options
       are globally set across all specified programs (e.g., -hostfile),
       while others are specific to a single program (e.g., -np).

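       The colon-separated layout can be pictured with a short Python
       sketch. This is only an illustration of the idea, not Open MPI's
       actual parser; the helper name split_app_contexts is hypothetical,
       and the sample command line is the MIMD example used elsewhere on
       this page.

```python
def split_app_contexts(argv):
    """Split an argument list on ':' tokens into one list per program."""
    contexts, current = [], []
    for token in argv:
        if token == ":":
            # A lone colon closes the current program specification.
            contexts.append(current)
            current = []
        else:
            current.append(token)
    contexts.append(current)
    return contexts


# Arguments as they would follow "mpirun" on the command line.
argv = "-np 1 -host a hostname : -np 2 -host b,c uptime".split()
for context in split_app_contexts(argv):
    print(context)
```

       Each resulting list carries its own local options (here -np and
       -host) followed by the program name.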
   Process Slots
       Open MPI uses "slots" to represent a potential location for a
       process. Hence, a node with 2 slots means that 2 processes can be
       launched on that node. For performance, the community typically
       equates a "slot" with a physical CPU, thus ensuring that any
       process assigned to that slot has a dedicated processor. This is
       not, however, a requirement for the operation of Open MPI.

       Slots can be specified in hostfiles after the hostname. For
       example:

       host1.example.com slots=4
              Indicates that there are 4 process slots on host1.

       If no slots value is specified, then Open MPI will automatically
       assign a default value of "slots=1" to that host.

       When running under resource managers (e.g., SLURM, Torque, etc.),
       Open MPI will obtain both the hostnames and the number of slots
       directly from the resource manager. For example, if running under
       a SLURM job, Open MPI will automatically receive the hosts that
       SLURM has allocated to the job as well as how many slots on each
       node SLURM says are usable - in most high-performance
       environments, the slots will equate to the number of processors on
       the node.

       When deciding where to launch processes, Open MPI will first fill
       up all available slots before oversubscribing (see "Location
       Nomenclature", below, for more details on the scheduling
       algorithms available). Unless told otherwise, Open MPI will
       arbitrarily oversubscribe nodes. For example, if the only node
       available is the localhost, Open MPI will run as many processes as
       specified by the -n (or one of its variants) command line option
       on the localhost (although they may run quite slowly, since
       they'll all be competing for CPU and other resources).

       Limits can be placed on oversubscription with the "max_slots"
       attribute in the hostfile. For example:

       host2.example.com slots=4 max_slots=6
              Indicates that there are 4 process slots on host2. Further,
              Open MPI is limited to launching a maximum of 6 processes
              on host2.

       host3.example.com slots=2 max_slots=2
              Indicates that there are 2 process slots on host3 and that
              no oversubscription is allowed (similar to the
              -nooversubscribe option).

       host4.example.com max_slots=2
              Shorthand; same as listing "slots=2 max_slots=2".

       Note that Open MPI's support for resource managers does not
       currently set the "max_slots" values for hosts. If you wish to
       prevent oversubscription in such scenarios, use the
       -nooversubscribe option.

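       The hostfile rules above, together with the -byslot and -bynode
       options listed under OPTIONS, can be modeled in a few lines of
       Python. This is a rough sketch under stated assumptions, not Open
       MPI's scheduler: the function names are hypothetical, the sketch
       refuses to oversubscribe rather than guessing how extra processes
       would be spread, and skipping full hosts in by-node mode is an
       assumption of the sketch.

```python
def parse_hostfile(text):
    """Parse hostfile lines into (host, slots, max_slots) tuples."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        attrs = dict(field.split("=", 1) for field in fields[1:])
        max_slots = int(attrs["max_slots"]) if "max_slots" in attrs else None
        if "slots" in attrs:
            slots = int(attrs["slots"])
        elif max_slots is not None:
            slots = max_slots  # shorthand: max_slots alone also sets slots
        else:
            slots = 1          # default when no slots value is given
        nodes.append((fields[0], slots, max_slots))
    return nodes


def map_ranks(nodes, np, by_node=False):
    """Assign ranks 0..np-1 to hosts without exceeding any slot count."""
    free = {host: slots for host, slots, _ in nodes}
    if np > sum(free.values()):
        raise ValueError("this sketch does not model oversubscription")
    placement = []
    if not by_node:
        # By slot: use every slot on a host before moving to the next.
        for host, slots, _ in nodes:
            placement.extend([host] * slots)
        return placement[:np]
    # By node: one rank per host in turn (assumption: skip full hosts).
    while len(placement) < np:
        for host, _, _ in nodes:
            if free[host] > 0 and len(placement) < np:
                free[host] -= 1
                placement.append(host)
    return placement


hostfile = """\
host2.example.com slots=4 max_slots=6
host3.example.com slots=2 max_slots=2
host4.example.com max_slots=2
"""
nodes = parse_hostfile(hostfile)
print(nodes)
print(map_ranks(nodes, 4))                # by slot (the default)
print(map_ranks(nodes, 4, by_node=True))  # by node
```

       With these three hosts, four ranks mapped by slot all land on
       host2, while mapping by node spreads them across host2, host3,
       and host4 before returning to host2.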
       In scenarios where the user wishes to launch an application across
       all available slots by not providing a "-n" option on the mpirun
       command line, Open MPI will launch a process on each process slot
       for each host within the provided environment. For example, if a
       hostfile has been provided, then Open MPI will spawn processes on
       each identified host up to the "slots=x" limit if oversubscription
       is not allowed. If oversubscription is allowed (the default), then
       Open MPI will spawn processes on each host up to the "max_slots=y"
       limit if that value is provided. In all cases, the "-bynode" and
       "-byslot" mapping directives will be enforced to ensure proper
       placement of process ranks.

   Location Nomenclature
       As described above, mpirun can specify arbitrary locations in the
       current Open MPI universe. Locations can be specified either by
       CPU or by node.

       Note: This nomenclature does not force Open MPI to bind processes
       to CPUs - specifying a location "by CPU" is really a convenience
       mechanism for SMPs that ultimately maps down to a specific node.

       Specifying locations by node will launch one copy of an executable
       per specified node. Using the -bynode option tells Open MPI to use
       all available nodes. Using the -byslot option tells Open MPI to
       use all slots on an available node before allocating resources on
       the next available node. For example:

       mpirun -bynode -np 4 a.out
              Runs one copy of the executable a.out on all available
              nodes in the Open MPI universe. MPI_COMM_WORLD rank 0 will
              be on node0, rank 1 will be on node1, etc., regardless of
              how many slots are available on each of the nodes.

       mpirun -byslot -np 4 a.out
              Runs one copy of the executable a.out on each slot on a
              given node before running the executable on other available
              nodes.

   Specifying Hosts
       Hosts can be specified in a number of ways. The most common is in
       a hostfile. For example:

           shell$ cat my-hostfile
           node00 slots=2
           node01 slots=2
           node02 slots=2

       mpirun -hostfile my-hostfile -np 3 a.out
              This will run one copy of the executable a.out on hosts
              node00, node01, and node02.

       Another method for specifying hosts is directly on the command
       line. Here you can include and exclude hosts from the set of hosts
       to run on. For example:

       mpirun -np 3 -host a a.out
              Runs three copies of the executable a.out on host a.

       mpirun -np 3 -host a,b,c a.out
              Runs one copy of the executable a.out on hosts a, b, and c.

       mpirun -np 3 -hostfile my-hostfile -host node00 a.out
              Runs three copies of the executable a.out on host node00.

       mpirun -np 3 -hostfile my-hostfile -host node10 a.out
              This will prompt an error since node10 is not in
              my-hostfile; mpirun will abort.

       shell$ mpirun -np 1 -host a hostname : -np 2 -host b,c uptime
              Runs one copy of the executable hostname on host a, and one
              copy of the executable uptime on each of hosts b and c.

   No Local Launch
       Using the --nolocal option to orterun tells the system to not
       launch any of the application processes on the same node that
       orterun is running. While orterun typically blocks and consumes
       few system resources, this option can be helpful for launching
       very large jobs where orterun may actually need to use noticeable
       amounts of memory and/or processing time. --nolocal allows orterun
       to run without sharing the local node with the launched
       applications, and likewise allows the launched applications to run
       unhindered by orterun's system usage.

       Note that --nolocal will override any other specification to
       launch the application on the local node. It will disqualify the
       localhost from being capable of running any processes in the
       application.

       shell$ mpirun -np 1 -host localhost -nolocal hostname
              This example will result in an error because orterun will
              not find anywhere to launch the application.

   No Oversubscription
       Using the -nooversubscribe option causes Open MPI to implicitly
       set the "max_slots" value to be the same as the "slots" value for
       each node. This can be especially helpful when running jobs under
       a resource manager because Open MPI currently only sets the
       "slots" value for each node that it obtains from the resource
       manager.

   Application Context or Executable Program?
       To distinguish the two different forms, mpirun looks on the
       command line for the -app option. If it is specified, then the
       file named on the command line is assumed to be an application
       context. If it is not specified, then the file is assumed to be an
       executable program.

   Locating Files
       If no relative or absolute path is specified for a file, Open MPI
       will look for files by searching the directories in the user's
       PATH environment variable as defined on the source node(s).

       If a relative directory is specified, it must be relative to the
       initial working directory determined by the specific starter used.
       For example, when using the rsh or ssh starters, the initial
       directory is $HOME by default. Other starters may set the initial
       directory to the current working directory from the invocation of
       mpirun.

   Current Working Directory
       The -wdir mpirun option (and its synonym, -wd) allows the user to
       change to an arbitrary directory before the program is invoked. It
       can also be used in application context files to specify working
       directories on specific nodes and/or for specific applications.

       If the -wdir option appears both in a context file and on the
       command line, the context file directory will override the command
       line value.

       If the -wdir option is specified, Open MPI will attempt to change
       to the specified directory on all of the remote nodes. If this
       fails, mpirun will abort.

       If the -wdir option is not specified, Open MPI will send the
       directory name where mpirun was invoked to each of the remote
       nodes. The remote nodes will try to change to that directory. If
       they are unable (e.g., if the directory does not exist on that
       node), then Open MPI will use the default directory determined by
       the starter.

       All directory changing occurs before the user's program is
       invoked; it does not wait until MPI_INIT is called.

   Standard I/O
       Open MPI directs UNIX standard input to /dev/null on all processes
       except the MPI_COMM_WORLD rank 0 process. The MPI_COMM_WORLD rank
       0 process inherits standard input from mpirun. Note: The node that
       invoked mpirun need not be the same as the node where the
       MPI_COMM_WORLD rank 0 process resides. Open MPI handles the
       redirection of mpirun's standard input to the rank 0 process.

       Open MPI directs UNIX standard output and error from remote nodes
       to the node that invoked mpirun and prints it on the standard
       output/error of mpirun. Local processes inherit the standard
       output/error of mpirun and transfer to it directly.

       Thus it is possible to redirect standard I/O for Open MPI
       applications by using the typical shell redirection procedure on
       mpirun.

           shell$ mpirun -np 2 myapp < myinput > myoutput

       Note that in this example only the MPI_COMM_WORLD rank 0 process
       will receive the stream from myinput on stdin. The stdin on all
       the other nodes will be tied to /dev/null. However, the stdout
       from all nodes will be collected into the myoutput file.

   Signal Propagation
       When orterun receives a SIGTERM or SIGINT, it will attempt to kill
       the entire job by sending all processes in the job a SIGTERM,
       waiting a small number of seconds, then sending all processes in
       the job a SIGKILL.

       SIGUSR1 and SIGUSR2 signals received by orterun are propagated to
       all processes in the job. Other signals are not currently
       propagated by orterun.

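       From the application's side, a propagated SIGUSR1 is an ordinary
       signal. The following Python sketch shows one way a process might
       react to it; it is illustrative only, it sends the signal to
       itself as a stand-in for orterun forwarding it, and the handler
       name on_usr1 is hypothetical.

```python
import os
import signal
import time

received = []


def on_usr1(signum, frame):
    # Keep the handler minimal: only record that the signal arrived.
    received.append(signum)


signal.signal(signal.SIGUSR1, on_usr1)

# Stand-in for orterun propagating the signal to this process.
os.kill(os.getpid(), signal.SIGUSR1)

# Give the interpreter a moment to run the handler before checking.
deadline = time.time() + 2.0
while not received and time.time() < deadline:
    time.sleep(0.01)

print("SIGUSR1 received:", signal.SIGUSR1 in received)
```

       Recording the event and acting on it later in the main program
       keeps the handler itself free of any non-trivial work.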
   Process Termination / Signal Handling
       During the run of an MPI application, if any rank dies abnormally
       (either exiting before invoking MPI_FINALIZE, or dying as the
       result of a signal), mpirun will print out an error message and
       kill the rest of the MPI application.

       User signal handlers should probably avoid trying to clean up MPI
       state (Open MPI is, currently, neither thread-safe nor
       async-signal-safe). For example, if a segmentation fault occurs in
       MPI_SEND (perhaps because a bad buffer was passed in) and a user
       signal handler is invoked, if this user handler attempts to invoke
       MPI_FINALIZE, Bad Things could happen since Open MPI was already
       "in" MPI when the error occurred. Since mpirun will notice that
       the process died due to a signal, cleaning up MPI state is
       probably not necessary, and it is safest for the user to clean up
       only non-MPI state.

   Process Environment
       Processes in the MPI application inherit their environment from
       the Open RTE daemon on the node on which they are running. The
       environment is typically inherited from the user's shell. On
       remote nodes, the exact environment is determined by the boot MCA
       module used. The rsh launch module, for example, uses either rsh
       or ssh to launch the Open RTE daemon on remote nodes, and
       typically executes one or more of the user's shell-setup files
       before launching the Open RTE daemon. When running dynamically
       linked applications which require the LD_LIBRARY_PATH environment
       variable to be set, care must be taken to ensure that it is
       correctly set when booting Open MPI.

       See the "Remote Execution" section for more details.

   Remote Execution
       Open MPI requires that the PATH environment variable be set to
       find executables on remote nodes (this is typically only necessary
       in rsh- or ssh-based environments - batch/scheduled environments
       typically copy the current environment to the execution of remote
       jobs, so if the current environment has PATH and/or
       LD_LIBRARY_PATH set properly, the remote nodes will also have it
       set properly). If Open MPI was compiled with shared library
       support, it may also be necessary to have the LD_LIBRARY_PATH
       environment variable set on remote nodes as well (especially to
       find the shared libraries required to run user MPI applications).

       However, it is not always desirable or possible to edit shell
       startup files to set PATH and/or LD_LIBRARY_PATH. The -prefix
       option is provided for some simple configurations where this is
       not possible.

       The -prefix option takes a single argument: the base directory on
       the remote node where Open MPI is installed. Open MPI will use
       this directory to set the remote PATH and LD_LIBRARY_PATH before
       executing any Open MPI or user applications. This allows running
       Open MPI jobs without having pre-configured the PATH and
       LD_LIBRARY_PATH on the remote nodes.

       Open MPI adds the basename of the current node's "bindir" (the
       directory where Open MPI's executables are installed) to the
       prefix and uses that to set the PATH on the remote node.
       Similarly, Open MPI adds the basename of the current node's
       "libdir" (the directory where Open MPI's libraries are installed)
       to the prefix and uses that to set the LD_LIBRARY_PATH on the
       remote node. For example:

       Local bindir:  /local/node/directory/bin

       Local libdir:  /local/node/directory/lib64

       If the following command line is used:

           shell$ mpirun -prefix /remote/node/directory

       Open MPI will add "/remote/node/directory/bin" to the PATH and
       "/remote/node/directory/lib64" to the LD_LIBRARY_PATH on the
       remote node before attempting to execute anything.

       Note that -prefix can be set on a per-context basis, allowing for
       different values for different nodes.

       The -prefix option is not sufficient if the installation paths on
       the remote node are different than the local node (e.g., if "/lib"
       is used on the local node, but "/lib64" is used on the remote
       node), or if the installation paths are something other than a
       subdirectory under a common prefix.

       Note that executing mpirun via an absolute pathname is equivalent
       to specifying -prefix without the last subdirectory in the
       absolute pathname to mpirun. For example:

           shell$ /usr/local/bin/mpirun ...

       is equivalent to

           shell$ mpirun -prefix /usr/local

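       The path arithmetic described in this section can be checked with
       a small Python sketch. It only mirrors the rules as stated above
       and is not taken from Open MPI's source; the function names
       prefix_from_mpirun and remote_dirs are hypothetical.

```python
import posixpath


def prefix_from_mpirun(mpirun_path):
    """Drop the executable name and the last subdirectory (e.g. 'bin')."""
    return posixpath.dirname(posixpath.dirname(mpirun_path))


def remote_dirs(prefix, local_bindir, local_libdir):
    """Join the prefix with the basenames of the local bindir and libdir."""
    return (
        posixpath.join(prefix, posixpath.basename(local_bindir)),
        posixpath.join(prefix, posixpath.basename(local_libdir)),
    )


print(prefix_from_mpirun("/usr/local/bin/mpirun"))
print(remote_dirs("/remote/node/directory",
                  "/local/node/directory/bin",
                  "/local/node/directory/lib64"))
```

       The first call reproduces the /usr/local prefix from the example
       above; the second yields the directories added to the remote PATH
       and LD_LIBRARY_PATH in the earlier -prefix example.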
   Exported Environment Variables
       All environment variables that are named in the form OMPI_* will
       automatically be exported to new processes on the local and remote
       nodes. The -x option to mpirun can be used to export specific
       environment variables to the new processes. While the syntax of
       the -x option allows the definition of new variables, note that
       the parser for this option is currently not very sophisticated -
       it does not even understand quoted values. Users are advised to
       set variables in the environment and use -x to export them, not to
       define them.

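       As a rough model of the two export rules above, the following
       Python sketch selects which variables would be forwarded. It is an
       illustration only, not mpirun's implementation; the function name
       exported_environment and the variable MY_SETTING are hypothetical.

```python
def exported_environment(env, x_names=()):
    """Select the variables that the rules above say get forwarded."""
    # Rule 1: everything named OMPI_* is exported automatically.
    selected = {k: v for k, v in env.items() if k.startswith("OMPI_")}
    # Rule 2: variables named with -x are exported from the environment.
    for name in x_names:
        if name in env:
            selected[name] = env[name]
    return selected


env = {
    "OMPI_MCA_btl": "tcp,self",
    "HOME": "/home/user",
    "MY_SETTING": "a value with spaces",
}
print(exported_environment(env, x_names=["MY_SETTING"]))
```

       Because MY_SETTING is already set in the environment and only
       named on the command line, its value never has to pass through
       the -x parser.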
   MCA (Modular Component Architecture)
       The -mca switch allows the passing of parameters to various MCA
       modules. MCA modules have direct impact on MPI programs because
       they allow tunable parameters to be set at run time (such as which
       BTL communication device driver to use, what parameters to pass to
       that BTL, etc.).

       The -mca switch takes two arguments: <key> and <value>. The <key>
       argument generally specifies which MCA module will receive the
       value. For example, the <key> "btl" is used to select which BTL to
       be used for transporting MPI messages. The <value> argument is the
       value that is passed. For example:

       mpirun -mca btl tcp,self -np 1 foo
              Tells Open MPI to use the "tcp" and "self" BTLs, and to run
              a single copy of "foo" on an allocated node.

       mpirun -mca btl self -np 1 foo
              Tells Open MPI to use the "self" BTL, and to run a single
              copy of "foo" on an allocated node.

       The -mca switch can be used multiple times to specify different
       <key> and/or <value> arguments. If the same <key> is specified
       more than once, the <value>s are concatenated with a comma (",")
       separating them.

       Note: The -mca switch is simply a shortcut for setting environment
       variables. The same effect may be accomplished by setting
       corresponding environment variables before running mpirun. The
       form of the environment variables that Open MPI sets is:

           OMPI_MCA_<key>=<value>

       Note that the -mca switch overrides any previously set environment
       variables. Also note that unknown <key> arguments are still set as
       environment variables - they are not checked (by mpirun) for
       correctness. Illegal or incorrect <value> arguments may or may not
       be reported - it depends on the specific MCA module.

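       The mapping from -mca switches to environment variables can be
       sketched as follows. This is a minimal model of the rules stated
       above, assuming the OMPI_MCA_ variable prefix; the function name
       mca_environment is hypothetical and nothing here validates keys or
       values, just as mpirun does not.

```python
def mca_environment(pairs):
    """Turn (key, value) pairs from repeated -mca switches into variables."""
    merged = {}
    for key, value in pairs:
        if key in merged:
            # Same key given again: concatenate the values with a comma.
            merged[key] = merged[key] + "," + value
        else:
            merged[key] = value
    return {"OMPI_MCA_" + key: value for key, value in merged.items()}


# Equivalent of: mpirun -mca btl tcp -mca btl self ...
print(mca_environment([("btl", "tcp"), ("btl", "self")]))
```

       Setting the resulting variable in the environment before running
       mpirun has the same effect as the switches, unless a later -mca
       switch for the same key overrides it.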
EXAMPLES
       Be sure to also see the examples in the "Location Nomenclature"
       section, above.

       mpirun -np 1 prog1
              Load and execute prog1 on one node. Search the user's $PATH
              for the executable file on each node.

       mpirun -np 8 -byslot prog1
              Run 8 copies of prog1 wherever Open MPI wants to run them.

       mpirun -np 4 -mca btl ib,tcp,self prog1
              Run 4 copies of prog1 using the "ib", "tcp", and "self"
              BTLs for the transport of MPI messages.

RETURN VALUE
       mpirun returns 0 if all ranks started by mpirun exit after calling
       MPI_FINALIZE. A non-zero value is returned if an internal error
       occurred in mpirun, or one or more ranks exited before calling
       MPI_FINALIZE. If an internal error occurred in mpirun, the
       corresponding error code is returned. In the event that one or
       more ranks exit before calling MPI_FINALIZE, the return value of
       the rank of the process that mpirun first notices died before
       calling MPI_FINALIZE will be returned. Note that, in general, this
       will be the first rank that died, but this is not guaranteed.

       However, note that if the -nw switch is used, the return value
       from mpirun does not indicate the exit status of the ranks.

Open MPI                         March 2006                         MPIRUN(1)