Cooperative and decentralized workflow scheduling in global grids

Existing Grid scheduling systems, such as e-Science workflow brokers, operate in tandem but lack a cooperation mechanism that could lead to efficient application schedules across distributed resources. This lack of coordination degrades the utilization of resources, including computing cycles and network bandwidth. Moreover, current brokering systems have evolved around centralized client/server or hierarchical models, in which key functions such as resource discovery are delegated to centralized server machines. Centralized models have well-known drawbacks: limited scalability, a single point of failure, and network congestion on links leading to the server. To overcome these problems, this paper proposes a novel approach for decentralized and cooperative workflow scheduling in a dynamic and distributed Grid resource sharing environment. The participants in the system, such as workflow brokers, resources, and users belonging to multiple control domains, work together to form a single cooperative resource sharing environment. The proposed approach builds on a Distributed Hash Table (DHT) based d-dimensional logical index space for resource discovery, coordination, and overall system decentralization. This index space serves as a blackboard system, where distributed participants can post and search complex coordination objects that regulate system-wide scheduling decisions. With this approach, performance bottlenecks are likely to be eliminated, and efficient scheduling with enhanced scalability can be achieved. We evaluate and demonstrate the feasibility of our approach through an extensive trace-driven simulation. To compare the proposed approach against a non-cooperative scheduling approach, we conduct experiments with workflows of different sizes. The results show that our scheduling technique can reduce the makespan by up to 25% and demonstrates improved load balancing capability. We also compare the proposed approach against a centralized coordination technique and show that it is as efficient as the centralized technique with respect to achieving coordinated schedules.
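To make the blackboard idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes resource attributes are normalised to [0, 1], that the d-dimensional space is split into a regular grid of cells, and that each cell is assigned to a peer by hashing onto a ring, as a DHT would do. The class name `IndexSpace`, its methods, and the peer and site names are all hypothetical.

```python
import hashlib
from bisect import bisect_right
from itertools import product


class IndexSpace:
    """Toy d-dimensional blackboard: grid cells are hashed onto peers (assumed design)."""

    def __init__(self, peers, dims, cells_per_dim=4):
        self.dims = dims
        self.cells_per_dim = cells_per_dim
        # Place each peer on a hash ring, in the spirit of a DHT.
        self.ring = sorted((self._hash(p), p) for p in peers)
        self.store = {p: {} for p in peers}  # peer -> {cell: [(point, object)]}

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(str(key).encode()).hexdigest(), 16)

    def _cell(self, point):
        # Map a point with coordinates in [0, 1] to its grid cell.
        top = self.cells_per_dim - 1
        return tuple(min(int(x * self.cells_per_dim), top) for x in point)

    def owner(self, cell):
        # The first peer clockwise from the cell's hash owns the cell.
        keys = [h for h, _ in self.ring]
        i = bisect_right(keys, self._hash(cell)) % len(self.ring)
        return self.ring[i][1]

    def post(self, point, obj):
        # Publish a coordination object at a point in the index space.
        assert len(point) == self.dims
        cell = self._cell(point)
        self.store[self.owner(cell)].setdefault(cell, []).append((point, obj))

    def search(self, low, high):
        # Range query: visit every cell overlapping the box [low, high].
        ranges = [range(a, b + 1) for a, b in zip(self._cell(low), self._cell(high))]
        hits = []
        for cell in product(*ranges):
            for point, obj in self.store[self.owner(cell)].get(cell, []):
                if all(l <= x <= h for l, x, h in zip(low, point, high)):
                    hits.append(obj)
        return hits


space = IndexSpace(peers=["broker-A", "broker-B", "broker-C"], dims=2)
space.post((0.8, 0.5), "site-1")  # (normalised speed, normalised free capacity)
space.post((0.2, 0.9), "site-2")
print(space.search((0.5, 0.0), (1.0, 1.0)))  # prints ['site-1']
```

In this sketch a broker searching for fast resources contacts only the peers that own the overlapping cells, so no single server handles every lookup. A real deployment would route each cell lookup over the overlay network rather than reading a local dictionary.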
