Deadline-sensitive workflow orchestration without explicit resource control

Deadline-sensitive workflows require careful coordination of user constraints with resource availability. Current distributed resource access models provide varying degrees of resource control, from limited or none in grid batch systems to explicit control in cloud systems. Additionally, applications experience variability due to competing user loads, performance variations, failures, etc. These variations affect the quality of service (QoS), yet planning strategies do not account for them. In this paper we propose the Workflow ORchestrator for Distributed Systems (WORDS) architecture, based on a least-common-denominator resource model that abstracts the differences between grid and cloud systems and captures the QoS properties they provide. We investigate algorithms for effective orchestration (i.e., resource procurement and task mapping) of deadline-sensitive workflows atop the resource abstraction provided in WORDS. Our evaluation compares orchestration methodologies over TeraGrid and Amazon EC2 systems. Experimental results show that WORDS enables effective orchestration at reasonable cost on batch queue grid and cloud systems, with or without explicit resource control.
