An adaptive fault detector strategy for scientific workflow scheduling based on improved differential evolution algorithm in cloud

Abstract: Cloud computing is increasingly used to execute large-scale applications, and scientific communities choose cloud environments to run computation-intensive workflows. However, cloud systems can exhibit higher failure rates because of the large number of servers and components operating under intensive workloads. These failures may make virtual machines (VMs) unavailable for computation, so an effective and efficient fault-tolerant strategy is needed. This paper develops an adaptive fault detector strategy based on an Improved Differential Evolution (IDE) algorithm that minimizes energy consumption, makespan, and total cost while tolerating faults during scientific workflow scheduling. The proposed approach applies an adaptive network-based fuzzy inference system (ANFIS) prediction model to proactively control resource load fluctuation, which increases failure prediction accuracy before a fault or failure occurs. In addition, it applies a reactive fault tolerance technique for the case in which a processor fails and the scheduler must allocate a new VM to execute the workflow tasks. Experimental results show that, compared with the ICFWS, IDE, and ACO algorithms, the proposed approach significantly improves overall scheduling performance, achieves a higher degree of fault tolerance with a high HyperVolume (HV), minimizes makespan, energy consumption, and task fault ratio, and reduces total cost.
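The HyperVolume (HV) indicator mentioned above measures the objective-space region dominated by a set of solutions, bounded by a reference point; a larger value indicates a better Pareto front. The following is a minimal sketch for two minimized objectives only (for instance makespan and cost). It is not the paper's evaluation code: the function names, the sample schedules, and the reference point are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting by objective 1, a point is non-dominated only if it
    # strictly improves objective 2 over every point seen before it.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: List[Point], reference: Point) -> float:
    """Area dominated by `points` and bounded by `reference` (minimization)."""
    # Points that do not strictly improve on the reference contribute nothing.
    front = [
        p for p in pareto_front(points)
        if p[0] < reference[0] and p[1] < reference[1]
    ]
    volume = 0.0
    prev_second = reference[1]
    # Sum horizontal slabs: each front point adds the strip between its own
    # objective-2 value and that of the previous (worse in objective 2) point.
    for f1, f2 in front:
        volume += (reference[0] - f1) * (prev_second - f2)
        prev_second = f2
    return volume


# Hypothetical (makespan, cost) pairs for four candidate schedules.
schedules = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
hv = hypervolume_2d(schedules, reference=(5.0, 5.0))
print(hv)
```

In this example the schedule (3.0, 3.0) is dominated by (2.0, 2.0) and therefore does not contribute to the indicator. With three or more objectives, as in the energy, makespan, and cost setting described in the abstract, the computation requires a more general algorithm than this two-dimensional sweep.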
