Executing Long-running Multi-component Applications on Batch Grids

Computational grids are increasingly used to execute large multi-component scientific applications. The most widely reported advantages of executing applications on grids are performance benefits, in terms of speed, problem size, or solution quality, due to the increased number of processors. We explore the possibility of improving performance on grids without increasing the application’s processor space. For this, we consider grids with multiple batch systems. We explore the challenges and advantages of executing long-running multi-component applications on multiple batch sites, with a popular multi-component climate simulation application, CCSM, as the motivation. We have performed extensive simulation studies to estimate the single-site and multi-site execution rates of the applications for different system characteristics. Our experiments show that in many cases, execution on multiple batch systems can achieve better execution rates than single-site execution.
