Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds

Large-scale applications expressed as scientific workflows are often grouped into ensembles of inter-related workflows. In this paper, we address a new and important problem concerning the efficient management of such ensembles under budget and deadline constraints on Infrastructure-as-a-Service (IaaS) clouds. We discuss, develop, and assess algorithms based on static and dynamic strategies for both task scheduling and resource provisioning. We perform the evaluation via simulation using a set of scientific workflow ensembles with a broad range of budget and deadline parameters, taking into account uncertainties in task runtime estimations, provisioning delays, and failures. We find that the key factor determining the performance of an algorithm is its ability to decide which workflows in an ensemble to admit or reject for execution. Our results show that an admission procedure based on workflow structure and estimates of task runtimes can significantly improve the quality of solutions.
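To make the idea of admission concrete, the following is a minimal sketch of one possible admission test, not the procedure evaluated in the paper. It assumes each workflow carries a priority, per-task runtime estimates, and parent dependencies, and that cost is simply total estimated runtime multiplied by a flat hourly price. All names here (`Workflow`, `critical_path`, `admit`, `price_per_hour`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    priority: int                    # assumption: lower value = more important
    task_runtimes: dict[str, float]  # task -> estimated runtime in hours
    parents: dict[str, list[str]]    # task -> tasks that must finish first


def critical_path(wf: Workflow) -> float:
    """Length of the longest dependency chain, using the runtime estimates."""
    finish: dict[str, float] = {}

    def finish_time(task: str) -> float:
        if task not in finish:
            start = max(
                (finish_time(p) for p in wf.parents.get(task, [])),
                default=0.0,
            )
            finish[task] = start + wf.task_runtimes[task]
        return finish[task]

    return max(finish_time(t) for t in wf.task_runtimes)


def admit(workflows, budget, deadline, price_per_hour):
    """Admit workflows in priority order while estimates fit both constraints."""
    admitted = []
    remaining = budget
    for wf in sorted(workflows, key=lambda w: w.priority):
        # Assumed cost model: every task-hour is billed at a flat rate.
        cost = sum(wf.task_runtimes.values()) * price_per_hour
        # Structure-based check: the critical path is a lower bound on makespan.
        if critical_path(wf) <= deadline and cost <= remaining:
            admitted.append(wf.name)
            remaining -= cost
    return admitted


ensemble = [
    Workflow("a", 1, {"t1": 2.0, "t2": 3.0}, {"t2": ["t1"]}),
    Workflow("b", 2, {"t1": 4.0, "t2": 4.0}, {}),
    Workflow("c", 3, {"t1": 1.0}, {}),
]
result = admit(ensemble, budget=7.0, deadline=6.0, price_per_hour=1.0)
```

In this toy ensemble, workflow "b" is rejected because its estimated cost exceeds the budget left after admitting "a", while the cheaper, lower-priority "c" still fits. A real provisioner would also need to account for billing granularity, provisioning delays, and estimate errors, which this sketch ignores.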
