The scalable process topology interface of MPI 2.2

The Message-Passing Interface (MPI) standard provides basic means for adapting the mapping of MPI process ranks to processing elements, so that the communication characteristics of an application better match the capabilities of the underlying system. The MPI process topology mechanism enables the MPI implementation to rerank processes by creating a new communicator that reflects user-supplied information about the application's communication pattern. With the newly released MPI 2.2 version of the standard, the process topology mechanism has been enhanced with new interfaces for scalable and informative user specification of communication patterns. Applications with relatively static communication patterns are encouraged to take advantage of the mechanism whenever convenient by specifying their communication pattern to the MPI library. Reference implementations of the new mechanism can be expected to be readily available (and to come at essentially no cost), but non-trivial implementations pose challenging problems for the MPI implementer. This paper is addressed first and foremost to application programmers who want to use the new process topology interfaces. It explains the use of and motivation for the enhanced interfaces, and the advantages gained even with a straightforward implementation. For the MPI implementer, the paper summarizes the main issues in the efficient implementation of the interface and explains the optimization problems that need to be (approximately) solved by a good MPI library. Copyright © 2010 John Wiley & Sons, Ltd.
