by Christian Seberino (cseberino@ucsd.edu) May 9, 1996.
What is MPI? It is a portable, efficient standard for writing message passing parallel programs, and it can be used on shared memory machines as well as distributed memory ones. It is not a specific product but a standard; MPICH, for example, is a widely used implementation of MPI. Because the standard is hardware independent, MPI programs can run on a wide and growing variety of hardware. The standard defines bindings for Fortran and C. In the future, MPI will allow development of portable parallel software libraries.
What is not in MPI? I/O routines are not part of the MPI specification. Process startup is also left out: for example, how processes arrive at processors is not specified. Spawning of processes during execution is not included either, which implies that an MPI program starts and finishes with the same number of processes. MPI does not include debugging tools. However, the MPI standard is growing, and MPI-2 is expected to be finished in 1996. Future additions may incorporate these features, and specific implementations can also provide any of them as extensions.
Where did MPI come from? Sixty people from forty organizations worked out the standard over two years, from 1992 to 1994. This "MPI Forum" included representatives from IBM, Cray, Los Alamos, UCSB, Cornell, Rice and Yale, as well as others. In May 1994 the standard was complete and available to all. A copy can be obtained from the MPI Web Page.
Now I would like to introduce some of its features. Much of this description is from Chapter 8 of Designing and Building Parallel Programs by Ian Foster. The standard contains over 120 subroutines and functions, but one can do much with only six of them! These six "workhorses" of MPI are: MPI_INIT, MPI_FINALIZE, MPI_COMM_SIZE, MPI_COMM_RANK, MPI_SEND and MPI_RECV. The first two initialize and terminate an MPI program. The next two determine how many processes an MPI program is running with and the identifier (rank) of the calling process. The last two subroutines are used to send and receive messages.
MPI does not guarantee determinism. In other words, messages sent from different processes to the same process cannot be expected to arrive in any specific order. Nevertheless, with careful programming, it is possible to provide this feature. The send and receive commands attach a "tag" to each message and identify the sender and receiver. A receive can decline to use this information by giving MPI_ANY_SOURCE as the source argument and MPI_ANY_TAG as the tag argument, but it is recommended that this be avoided for safety, since a receive that names its source and tag can only match the message it was meant for.
In evaluating MPI, the question naturally arises as to how it compares to earlier portable message passing systems such as PVM. The major differences are not in power or functionality but in style. PVM provides mechanisms to cleanly add and delete hosts, spawn and kill processes, and perform similar tasks. MPI does not support this type of process management, yet a program can still create "sleeping processes" at startup, which basically accomplish the same thing. MPI provides easy to use message passing commands. For example, the user does not have to worry about packing and unpacking data as in PVM. Nevertheless, PVM can avoid packing by sending "raw" data, for instance on a homogeneous cluster. These examples show that the biggest difference between the two interfaces is one of style. More importantly, MPI is gaining momentum as a new de facto standard and will continue to develop through future versions such as MPI-2. It is not clear that PVM can create or maintain the momentum MPI has generated.
MPI provides many other advanced features. It includes support for asynchronous communication, which is a powerful feature when, for example, writing a program that must send many large messages, because computation can continue while the messages are in transit. MPI also supports organizing processes into groups. For example, one often wants to perform a broadcast to a subset of the total number of processes. Creating communicators that contain only those processes makes this straightforward. Many other global operations are also supported.
Performance is just as important an issue for parallel programming as portability. MPI was designed to be a practical and efficient replacement for existing message passing interfaces. For example, the Paragon's NX interface sits almost directly below MPI, so that in many cases MPI merely calls an analogous NX command. In this respect, MPI on a Paragon should have performance comparable to that of the machine's native interface! It must be stressed that much of the responsibility for MPI's performance lies with the quality of the implementation. As MPI becomes more popular, more work will be invested in optimizing it for a wide variety of machines.
In summary, I have introduced MPI and explained many of its features. Clearly, MPI provides an efficient, portable standard for writing message passing programs. It is becoming the de facto standard and is being implemented on a wide variety of architectures. Furthermore, MPI is a growing standard, and MPI-2 is expected to be completed in 1996. Although the standard provides over 120 commands, one can produce many powerful programs with just six of them. MPI can also provide just as much power and functionality as PVM, if not more.