A probabilistic scheduling heuristic for computational grids

Computational grids are large-scale distributed networks of peer clusters of computing resources, bound by a decentralized management framework for the purpose of providing computing services, called grid services. The scheduling problem consists of finding the clusters that host the required set of grid services and have sufficient available capacity to handle user service requests in compliance with a specified quality of service. The interplay of intermittent resource participation, resource load dynamics, network latency and processing delay, and random subsystem failures creates pervasive uncertainty about the capacity of the grid to handle user requests. In addition to accounting for this uncertainty, the scheduling strategy has to be decentralized, since computational grids span distinct management domains. In this paper, we propose a decentralized scheduling strategy that views the dynamics of the grid service capacity as a stochastic process modeled by a Markov chain. The proposed scheduling scheme uses this model to predict the future local availability of resources. This prediction is complemented by a confidence model that approximates the future ability of peer clusters to successfully handle delegated service requests. The scalability of the proposed scheduling strategy is illustrated through simulation.
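The prediction step described above can be illustrated with a minimal sketch. It assumes a discrete-time Markov chain over a small set of capacity levels; the abstract does not give the paper's state space or transition probabilities, so the names `CAPACITIES`, `P`, `predict`, and `prob_available`, and every number below, are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def step(dist: List[float], P: Matrix) -> List[float]:
    """Advance a state distribution by one step: dist' = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def predict(dist: List[float], P: Matrix, k: int) -> List[float]:
    """Return the state distribution after k steps of the chain."""
    for _ in range(k):
        dist = step(dist, P)
    return dist


def prob_available(dist: List[float], capacities: List[int], demand: int) -> float:
    """Probability that the available capacity is at least `demand`."""
    return sum(p for p, c in zip(dist, capacities) if c >= demand)


# Hypothetical chain (an assumption, not taken from the paper):
# state i means CAPACITIES[i] units of service capacity are free.
CAPACITIES = [0, 1, 2]
P = [
    [0.5, 0.5, 0.0],    # from 0 free units
    [0.25, 0.5, 0.25],  # from 1 free unit
    [0.0, 0.5, 0.5],    # from 2 free units
]

start = [0.0, 0.0, 1.0]            # currently 2 units free
two_step = predict(start, P, 2)    # distribution two steps ahead
chance = prob_available(two_step, CAPACITIES, 1)
print(two_step, chance)
```

In a sketch like this, a cluster could accept a request locally when `chance` exceeds some threshold and otherwise consider delegating it to a peer; the threshold and the delegation rule are design choices that the abstract leaves unspecified.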
