Experiments in running a scientific MPI application on Grid'5000

Over the last couple of years, several dedicated grid platforms have been set up to test applications and middleware for grids. Among these is Grid'5000, a reconfigurable platform gathering resources at nine geographically distant sites in France. This paper presents one of the eight experiments that have tested software scalability at the scale of a thousand processors (i.e., 500 to 1000) on this grid testbed. The experiment analyzes the behavior of a geophysical application (seismic ray tracing in a 3D mesh of the Earth). The application is computationally intensive but requires an all-to-all communication phase during which processors exchange their results, and this phase has proven to be a real bottleneck on many hardware platforms. We analyze various runs and show that this application scales well up to about 500 processors on such a grid.
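To illustrate why the all-to-all phase can become a bottleneck as the processor count grows, the following minimal sketch counts the point-to-point messages in a naive exchange. It is an illustration only, not the application's code: it assumes each process sends exactly one message to every other process, whereas real MPI collectives may use optimized algorithms. The function names `all_to_all_messages` and `simulate_all_to_all` are hypothetical.

```python
def all_to_all_messages(p: int) -> int:
    """Point-to-point messages in a naive all-to-all among p processes.

    Assumption: every process sends one message to each of the other
    p - 1 processes, so the total grows quadratically with p.
    """
    return p * (p - 1)


def simulate_all_to_all(results):
    """Simulate the exchange in memory (no real network or MPI involved).

    results[i] is the local result of process i. Returns the inbox of
    every process and the number of messages that crossed between
    distinct processes.
    """
    p = len(results)
    inbox = [[None] * p for _ in range(p)]
    sent = 0
    for src in range(p):
        for dst in range(p):
            inbox[dst][src] = results[src]
            if src != dst:  # a process keeps its own result locally
                sent += 1
    return inbox, sent


if __name__ == "__main__":
    for p in (10, 100, 500, 1000):
        print(p, all_to_all_messages(p))
```

Under this assumption, doubling the number of processes roughly quadruples the number of messages, which is one reason the communication phase weighs more heavily at larger scales.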
