Towards Efficient Execution of MPI Applications on the Grid: Porting and Optimization Issues

The Message Passing Interface (MPI) is a standard used by many parallel scientific applications. It offers a smooth migration path for porting applications from high-performance computing systems to the Grid. In this paper, Grid-enabled tools and libraries for developing MPI applications are presented. The first is MARMOT, a tool that checks the adherence of an application to the MPI standard. The second is PACX-MPI, an implementation of the MPI standard optimized for Grid environments. Beyond efficient program development, efficient execution is of paramount importance for most scientific applications. We therefore discuss not only performance at the level of the MPI library, but also several application-specific optimizations, such as latency hiding, prefetching, caching, and topology-aware algorithms, for a sparse parallel equation solver and an RNA folding code.
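
The following is a minimal sketch, not taken from the paper, of the latency-hiding idea the abstract mentions: nonblocking MPI calls post a halo exchange early and overlap the (potentially wide-area) communication with computation that does not depend on the incoming data. The ring topology, buffer sizes, and the dummy interior update are illustrative assumptions only.

```c
/* Sketch: latency hiding with nonblocking MPI point-to-point calls.
 * Each rank exchanges a boundary buffer with its neighbors in a ring
 * while updating interior data that needs no remote values. */
#include <mpi.h>
#include <stdlib.h>

#define N 1024  /* illustrative local buffer size */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *sendbuf  = malloc(N * sizeof(double));
    double *recvbuf  = malloc(N * sizeof(double));
    double *interior = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) { sendbuf[i] = rank; interior[i] = i; }

    int right = (rank + 1) % size;
    int left  = (rank - 1 + size) % size;

    /* Post the communication as early as possible ... */
    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... and overlap it with work that does not need the remote data. */
    for (int i = 0; i < N; i++)
        interior[i] *= 0.5;

    /* Wait only when the boundary data is actually needed. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    free(sendbuf); free(recvbuf); free(interior);
    MPI_Finalize();
    return 0;
}
```

On a Grid, where inter-site latencies are orders of magnitude larger than intra-cluster ones, overlapping communication with computation in this way is one of the simplest ways to keep processors busy while messages are in flight.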
