MPI Applications on Grids: A Topology Aware Approach

Porting complex MPI applications that involve collective communications to grids requires significant program modification, usually dedicated to a single grid structure. The difficulty comes from the mismatch between program organization and grid structure: (1) large grids are hierarchical structures that aggregate parallel machines through an interconnection network, and whose structure is decided at runtime, and (2) the MPI standard does not currently provide any specific information for topology-aware applications, so almost all MPI applications have been developed following a non-hierarchical, inflexible view. In this paper, we propose a generic programming method and a modification of the MPI runtime environment to make MPI applications topology aware. In contrast to previous approaches, the topology requirements of the application are given to the grid scheduling system, which exposes the compatible allocated topology to the application.
