A delay-based dynamic scheduling algorithm for bag-of-task workflows with stochastic task execution times in clouds

Bag-of-Tasks (BoT) workflows are widespread in many big data analysis fields. However, very few cloud resource provisioning and scheduling algorithms are tailored to BoT workflows. Furthermore, existing algorithms fail to consider the stochastic task execution times of BoT workflows, which leads to deadline violations and increased resource renting costs. In this paper, we propose a dynamic cloud resource provisioning and scheduling algorithm that aims to meet the workflow deadline by using the sum of the expectation and the standard deviation of each task's execution time as an estimate of its real execution time. A bag-based delay scheduling strategy and a single-type based virtual machine interval renting method are presented to decrease the resource renting cost. The proposed algorithm is evaluated using the cloud simulator ElasticSim, which is extended from CloudSim. The results show that, compared to existing algorithms, the dynamic algorithm decreases the resource renting cost while guaranteeing the workflow deadline.

The cloud resource renting cost of bag-of-tasks workflows is minimized. A bag-based delay triggering strategy is proposed to fully use the bag structure. The expectation and variance of execution times are used to estimate actual execution times. A single-type based greedy method is developed for each ready BoT.
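The estimation rule and the interval-based renting idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the longest-task-first packing, and the VM type table (relative speed, price per interval) are all assumptions made for illustration. Only the rule "estimate = expectation + standard deviation" and the restriction to a single VM type per bag come from the abstract.

```python
import math
from statistics import mean, pstdev


def estimate_time(samples):
    """Estimated execution time: mean plus (population) standard deviation."""
    return mean(samples) + pstdev(samples)


def interval_cost(busy_time, interval, price):
    """Cost of one VM billed in whole intervals (partial intervals round up)."""
    return math.ceil(busy_time / interval) * price


def cheapest_single_type(estimates, vm_types, time_budget, interval):
    """Hypothetical greedy: for each VM type and VM count, pack the bag's
    tasks longest-first onto the least-loaded VM, and keep the cheapest
    configuration whose makespan fits the time budget.

    vm_types maps a type name to (relative speed, price per interval).
    Returns (cost, type name, VM count), or None if nothing fits.
    """
    best = None
    for name, (speed, price) in vm_types.items():
        for n in range(1, len(estimates) + 1):
            loads = [0.0] * n
            for t in sorted(estimates, reverse=True):
                i = loads.index(min(loads))  # least-loaded VM so far
                loads[i] += t / speed
            if max(loads) > time_budget:
                continue  # this configuration would miss the budget
            cost = sum(interval_cost(load, interval, price) for load in loads)
            if best is None or cost < best[0]:
                best = (cost, name, n)
    return best


if __name__ == "__main__":
    # Four tasks, each with observed run times of 8 and 16 time units.
    estimates = [estimate_time([8, 16]) for _ in range(4)]
    vm_types = {"small": (1.0, 1.0), "large": (2.0, 3.0)}  # assumed values
    print(cheapest_single_type(estimates, vm_types, time_budget=32, interval=60))
```

Adding the standard deviation to the mean makes the estimate deliberately pessimistic, so a schedule built on it is less likely to miss the deadline when tasks run longer than average, at the price of sometimes renting more capacity than strictly needed.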
