Resource-Aware Distributed Scheduling Strategies for Large-Scale Computational Cluster/Grid Systems

In this paper, we propose distributed algorithms referred to as resource-aware dynamic incremental scheduling (RADIS) strategies. The strategies are designed to handle large volumes of computationally intensive, arbitrarily divisible loads submitted for processing at cluster/grid systems involving multiple sources and sinks (processing nodes). We consider a real-life scenario in which the buffer space (memory) available at the sinks, required for holding and processing the loads, varies over time and the loads have deadlines. For this scenario, we propose efficient "pull-based" scheduling strategies with an admission control policy that ensures that admitted loads are processed within their deadlines. The design of the proposed strategies adopts the divisible load paradigm, referred to as divisible load theory (DLT), which has been shown to be efficient in handling large-volume loads. We demonstrate the detailed workings of the proposed algorithms via a simulation study using real-life parameters obtained from a major physics experiment.
