Experiences on adaptive grid scheduling of parameter sweep applications

Grids offer a dramatic increase in the number of compute and storage resources that can be delivered to applications. This new computational infrastructure is a promising platform for executing loosely coupled, high-throughput parameter sweep applications. Such applications arise naturally in many scientific and engineering fields, such as bioinformatics, computational fluid dynamics (CFD), and particle physics. Executing and scheduling parameter sweep applications efficiently is challenging because grids are dynamic and heterogeneous. We present a scheduling algorithm built on top of the GridWay framework that combines: (i) adaptive scheduling to reflect the dynamic characteristics of the grid; (ii) adaptive execution to migrate running jobs to better resources and provide fault tolerance; and (iii) re-use of common files between tasks to reduce file transfer overhead. The efficiency of the approach is demonstrated by executing a CFD application on a highly heterogeneous research testbed.
