Steady-State Scheduling of Multiple Divisible Load Applications on Wide-Area Distributed Computing Platforms

Divisible load applications consist of an amount of data and associated computation that can be divided arbitrarily into any number of independent pieces. This model is a good approximation of many real-world scientific applications and lends itself to a natural master-worker implementation, so it has received considerable attention. Divisible load scheduling has been studied extensively in previous work. However, only a few authors have explored the simultaneous scheduling of multiple such applications on a distributed computing platform. We focus on this increasingly relevant scenario and make the following contributions. First, we use a novel and more realistic platform model that captures some of the fundamental network properties of grid platforms. Second, we formulate the steady-state multi-application scheduling problem as a linear program that expresses a notion of fairness between applications. Third, because this scheduling problem is NP-complete, we propose several heuristics, which we evaluate and compare via extensive simulation experiments. Our main finding is that some of our heuristics achieve performance close to optimal, and we quantify the trade-offs between achieved performance and heuristic complexity.
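
To make the notion of fairness between applications more concrete, the sketch below works through a deliberately simplified case. It is not the formulation used in this work: it assumes a single shared compute resource and no network constraints, and it assumes that fairness means maximizing the smallest priority-weighted throughput. The names `speed`, `work`, and `priority` are hypothetical. Under these assumptions the linear program has a closed-form solution, so no solver is needed.

```python
from typing import List, Sequence


def max_min_fair_throughputs(
    speed: float,
    work: Sequence[float],
    priority: Sequence[float],
) -> List[float]:
    """Toy steady-state allocation on ONE shared compute resource.

    Assumed model (illustrative only, not taken from the paper):
      maximize   min_k priority[k] * alpha[k]
      subject to sum_k work[k] * alpha[k] <= speed,  alpha[k] >= 0
    where alpha[k] is the load of application k processed per time unit
    and work[k] is the computation needed per unit of that load.
    """
    if len(work) != len(priority):
        raise ValueError("work and priority must have the same length")
    if not work:
        return []
    if speed < 0 or any(w <= 0 for w in work) or any(p <= 0 for p in priority):
        raise ValueError("speed must be >= 0; work and priority must be > 0")

    # At the optimum every weighted throughput equals a common value t and
    # the capacity constraint is tight, so with alpha[k] = t / priority[k]:
    #   t * sum_k work[k] / priority[k] = speed
    t = speed / sum(w / p for w, p in zip(work, priority))
    return [t / p for p in priority]


if __name__ == "__main__":
    # Two applications; the second needs twice the work per load unit
    # and has twice the priority weight.
    alphas = max_min_fair_throughputs(100.0, [1.0, 2.0], [1.0, 2.0])
    print(alphas)
```

In a wide-area setting the allocation would also have to respect network constraints, which this toy case leaves out entirely; that is where a general-purpose solver or heuristics become necessary.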
