Grids with Multiple Batch Systems for Performance Enhancement of Multi-component and Parameter Sweep Parallel Applications

Future Generation Computer Systems

In this work, we evaluate the benefits of using Grids with multiple batch systems to improve the performance of multi-component and parameter sweep parallel applications by reducing queue waiting times. Using job traces with different loads and job distributions, and queue waiting times corresponding to three queuing policies (FCFS, conservative backfilling and EASY backfilling), we conducted a large number of experiments with simulators of two important classes of applications. The first simulator models the Community Climate System Model (CCSM), a prominent multi-component application; the second models parameter sweep applications. We compare application performance on multiple batch systems against execution on a single batch system for different system and application configurations. We show that for a large number of configurations, execution on multiple batch systems improves performance over execution on a single system.
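
The intuition behind the queue-wait argument can be illustrated with a minimal sketch. It is not the simulator used in this work: the function `fcfs_wait_times`, the 8-processor systems, and the tiny job traces are all hypothetical, and only plain FCFS (no backfilling) is modelled. It also assumes that a multi-component application starts only once all of its components have started, so its wait is the maximum of the component waits.

```python
import heapq


def fcfs_wait_times(jobs, total_procs):
    """Return the queue wait of each job under plain FCFS (no backfilling).

    jobs: list of (arrival, procs, runtime) tuples, sorted by arrival time.
    total_procs: number of processors in the (hypothetical) batch system.
    """
    running = []          # min-heap of (end_time, procs) for running jobs
    free = total_procs    # processors currently idle
    clock = 0.0           # start time of the most recently started job
    waits = []
    for arrival, procs, runtime in jobs:
        if procs > total_procs:
            raise ValueError("job requests more processors than the system has")
        # FCFS: a job cannot start before the job ahead of it has started.
        clock = max(clock, arrival)
        # Reclaim processors from jobs that have already finished.
        while running and running[0][0] <= clock:
            _, done_procs = heapq.heappop(running)
            free += done_procs
        # Otherwise wait for running jobs to finish until enough are free.
        while free < procs:
            end, done_procs = heapq.heappop(running)
            clock = max(clock, end)
            free += done_procs
        waits.append(clock - arrival)
        heapq.heappush(running, (clock + runtime, procs))
        free -= procs
    return waits


# Illustrative traces (made up): each system has 8 processors and one
# background job already running when the application arrives at t = 10.
background_a = (0, 6, 100)   # system A: 6 processors busy until t = 100
background_b = (0, 7, 40)    # system B: 7 processors busy until t = 40

# Case 1: the whole application (4 processors) is submitted to system A.
single_wait = fcfs_wait_times([background_a, (10, 4, 50)], 8)[-1]

# Case 2: two 2-processor components, one submitted to each system.
wait_a = fcfs_wait_times([background_a, (10, 2, 50)], 8)[-1]
wait_b = fcfs_wait_times([background_b, (10, 2, 50)], 8)[-1]
split_wait = max(wait_a, wait_b)

print("single system wait:", single_wait)
print("multiple systems wait:", split_wait)
```

In this toy trace the smaller per-system requests fit into idle processors sooner than the single large request does, which is the effect the experiments measure at scale. The sketch ignores inter-system communication costs and backfilling, both of which affect the real comparison.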
