Cost Effective Genetic Algorithm for Workflow Scheduling in Cloud Under Deadline Constraint

Cloud computing is an increasingly popular paradigm that delivers high-performance computing resources over the Internet for solving large-scale scientific problems, but several challenges must still be addressed to execute scientific workflows. Existing research has mainly focused on minimizing either finishing time (makespan) or cost while meeting quality of service requirements. However, most of this work does not consider essential characteristics of the cloud or major issues such as virtual machine (VM) performance variation and acquisition delay. In this paper, we propose a meta-heuristic cost-effective genetic algorithm that minimizes the execution cost of a workflow while meeting its deadline in a cloud computing environment. We develop novel schemes for encoding, population initialization, and the crossover and mutation operators of the genetic algorithm. Our proposal considers all the essential characteristics of the cloud as well as VM performance variation and acquisition delay. Performance evaluation on well-known scientific workflows of different sizes, such as Montage, LIGO, CyberShake, and Epigenomics, shows that our proposed algorithm performs better than current state-of-the-art algorithms.
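To make the optimization target concrete, the following is a minimal sketch of how one candidate schedule might be scored in a deadline-constrained, cost-minimizing genetic algorithm. It is not the encoding or fitness function of the paper, which the abstract does not specify. The names `VMType`, `evaluate`, and `fitness_key`, the fixed degradation factor, the billing-interval cost model, and the "feasible first, then cheapest" ranking are all assumptions made for illustration, and data transfer times are ignored.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    mips: float                # nominal speed (hypothetical unit)
    price_per_interval: float  # price of one billing interval


def evaluate(task_sizes, parents, topo_order, task_to_vm, vm_types,
             acquisition_delay=60.0, degradation=0.1, interval=3600.0):
    """Return (makespan, cost) for one task-to-VM assignment.

    Assumptions: every VM is requested at time 0 and becomes usable after
    `acquisition_delay`; performance variation is modelled as a fixed
    fractional slowdown; billing starts when the VM becomes usable and is
    rounded up to whole intervals.
    """
    vm_free_at = {}  # VM id -> time at which the VM is next idle
    finish = {}      # task id -> finish time
    for task in topo_order:
        vm = task_to_vm[task]
        speed = vm_types[vm].mips * (1.0 - degradation)
        ready = vm_free_at.get(vm, acquisition_delay)
        # A task starts once its VM is idle and all parents have finished.
        start = max([ready] + [finish[p] for p in parents[task]])
        finish[task] = start + task_sizes[task] / speed
        vm_free_at[vm] = finish[task]
    makespan = max(finish.values())
    cost = 0.0
    for vm, end in vm_free_at.items():
        intervals = math.ceil((end - acquisition_delay) / interval)
        cost += intervals * vm_types[vm].price_per_interval
    return makespan, cost


def fitness_key(makespan, cost, deadline):
    """Sort key: feasible schedules by cost, then infeasible ones by overrun."""
    if makespan <= deadline:
        return (0, cost)
    return (1, makespan - deadline)


if __name__ == "__main__":
    # Toy fork-shaped workflow: task 0 precedes tasks 1 and 2.
    sizes = {0: 900.0, 1: 1800.0, 2: 900.0}
    parents = {0: [], 1: [0], 2: [0]}
    vms = [VMType(1.0, 1.0), VMType(2.0, 3.0)]
    makespan, cost = evaluate(sizes, parents, [0, 1, 2], [0, 0, 1], vms)
    print(makespan, cost, fitness_key(makespan, cost, deadline=4000.0))
```

Sorting a population by `fitness_key` places every deadline-meeting schedule ahead of every violating one, so selection pressure first pushes toward feasibility and then toward lower cost. This is one common way to handle constraints in genetic algorithms and is shown here only as an example.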
