Elastic Scheduling of Scientific Workflows under Deadline Constraints in Cloud Computing Environments

Scientific workflow applications are collections of structured activities and fine-grained computational tasks. Scheduling such workflows in cloud computing is a challenging research topic because of their distinctive features. In cloud environments, efficient task scheduling must reduce scheduling overhead, minimize cost and maximize resource utilization while still meeting the user-specified overall deadline. This paper proposes a strategy, Dynamic Scheduling of Bag of Tasks based workflows (DSB), for scheduling scientific workflows with the aim of minimizing the financial cost of leasing Virtual Machines (VMs) under a user-defined deadline constraint. The proposed model groups the workflow into Bags of Tasks (BoTs) based on data dependency and priority constraints, and then optimizes the allocation and scheduling of BoTs on elastic, heterogeneous and dynamically provisioned cloud resources (VMs) to meet these objectives. The approach considers pay-as-you-go Infrastructure as a Service (IaaS) clouds with inherent features such as elasticity, abundance, heterogeneity and VM provisioning delays. A trace-based simulation using benchmark scientific workflows that represent real-world applications demonstrates a significant reduction in workflow computation cost while the workflow deadline is met. The results show that the proposed model achieves higher success rates in meeting deadlines and better cost efficiency than adapted state-of-the-art algorithms for similar problems.
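The abstract does not spell out how DSB forms its Bags of Tasks, so the following is only a minimal sketch of one plausible dependency-based grouping: tasks are placed in the same bag when they sit at the same depth of the workflow graph, which guarantees that no task shares a bag with one of its ancestors. The function name `group_into_bags`, the `parents` mapping and the four-task example workflow are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_into_bags(parents):
    """Group workflow tasks into bags by depth level (an assumed rule, not DSB itself).

    `parents` maps every task name to the list of tasks it depends on.
    Every task, including entry tasks, must appear as a key.
    """
    level = {}

    def depth(task):
        # Depth is 0 for entry tasks, otherwise 1 + the deepest parent.
        if task not in level:
            level[task] = 1 + max((depth(p) for p in parents[task]), default=-1)
        return level[task]

    bags = defaultdict(list)
    for task in parents:
        bags[depth(task)].append(task)

    # Return bags in execution order; tasks inside a bag are mutually independent.
    return [sorted(bags[k]) for k in sorted(bags)]


# Hypothetical fork-join workflow: split -> (map1, map2) -> merge
workflow = {
    "split": [],
    "map1": ["split"],
    "map2": ["split"],
    "merge": ["map1", "map2"],
}
bags = group_into_bags(workflow)
print(bags)
```

Under this assumed rule, each bag can be dispatched to VMs only after the previous bag has finished, which is what allows a scheduler to size the VM pool per bag against the share of the deadline left to it.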
