Scheduling deadline constrained scientific workflows on dynamically provisioned cloud resources

Abstract: Commercial cloud computing resources are rapidly becoming the target platform for scientific computation, due to the massive leverage they offer and their elastic pay-as-you-go pricing model. The cloud allows researchers and institutions to provision compute only when required, and to scale seamlessly as needed. The cloud computing paradigm therefore presents a low-capital, low-barrier alternative to operating dedicated HPC eScience infrastructure. However, there are still significant technical hurdles in obtaining sufficient execution performance while limiting financial cost. In particular, a naive scheduling algorithm may increase the cost of computation to the point that using cloud resources is no longer viable. This article concentrates on the problem of scheduling deadline-constrained scientific workloads on dynamically provisioned cloud resources while reducing the cost of computation. Specifically, we present two algorithms, Proportional Deadline Constrained (PDC) and Deadline Constrained Critical Path (DCCP), that address the workflow scheduling problem on such resources. These algorithms are additionally extended to refine their operation in task prioritization and backfilling, respectively. The results indicate that both PDC and DCCP achieve higher cost efficiency and success rates than existing algorithms.
