Elastic resource provisioning for scientific workflow scheduling in cloud under budget and deadline constraints

With the growing adoption of cloud computing, many scientific computing applications now run in cloud environments. At the same time, the application scenarios of scientific computing are becoming increasingly dynamic and complicated: jobs arrive at unpredictable submission times, carry different priorities, and must be executed under deadline and budget constraints. How to perform scientific computing efficiently in the cloud has therefore become an urgent problem. To address it, we design an elastic resource provisioning and task scheduling mechanism for executing scientific workflow jobs in the cloud. The goal of this mechanism is to complete as many high-priority workflow jobs as possible under budget and deadline constraints. The mechanism consists of four steps: job preprocessing, job admission control, elastic resource provisioning, and task scheduling. We evaluate it with four kinds of real scientific workflow jobs under different budget constraints. The evaluation also accounts for uncertainty in task runtime estimations, VM provisioning delays, and task failures. The results show that in most cases our mechanism performs better than other mechanisms. In addition, uncertainty in task runtime estimations, VM provisioning delays, and task failures does not have a major impact on the mechanism's performance.
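To make the stated goal concrete, the following is a minimal sketch of what a priority-ordered admission check under a shared budget and per-job deadlines could look like. It is not the mechanism proposed in the paper, whose details the abstract does not give. The `Job` fields, the `admit_jobs` function, the "larger number means higher priority" convention, and the single cost and runtime estimate per job are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical workflow job description (all fields are assumptions)."""
    name: str
    priority: int        # larger value = more important (assumed convention)
    est_cost: float      # estimated cost of running the whole job
    est_runtime: float   # estimated runtime on the resources to be provisioned
    deadline: float      # time allowed, measured from submission


def admit_jobs(jobs, budget):
    """Greedy admission sketch: consider jobs from highest to lowest priority
    (cheaper jobs first within a priority) and admit a job only if its
    estimated runtime fits its deadline and its estimated cost fits the
    budget that is still unspent."""
    admitted = []
    remaining = budget
    for job in sorted(jobs, key=lambda j: (-j.priority, j.est_cost)):
        meets_deadline = job.est_runtime <= job.deadline
        within_budget = job.est_cost <= remaining
        if meets_deadline and within_budget:
            admitted.append(job.name)
            remaining -= job.est_cost
    return admitted, remaining
```

A greedy pass like this is only one simple option. It does not guarantee the largest possible number of admitted high-priority jobs, and it ignores the runtime uncertainty, provisioning delays, and failures that the evaluation considers.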
