Task scheduling, resource provisioning, and load balancing on scientific workflows using parallel SARSA reinforcement learning agents and genetic algorithm

Cloud computing is one of the most popular distributed environments, in which multiple powerful, heterogeneous resources are shared by different user applications. Task scheduling and resource provisioning, together called cloud resource management, are two important challenges of the cloud environment. Resource management is especially difficult for scientific workflows because of their heavy computations and the dependencies between their operations. Several algorithms and methods have been developed to manage cloud resources. In this paper, a combination of state-action-reward-state-action (SARSA) learning and a genetic algorithm is used to manage cloud resources. In the first step, intelligent agents schedule the tasks during the learning process by exploring the workflow. Then, in the resource provisioning step, each resource is assigned to an agent, and the agent's learning process attempts to maximize the utilization of that resource by selecting the most appropriate set of tasks. A genetic algorithm is used to make the agents converge and to achieve global optimization. Its fitness function seeks more efficient resource utilization and better load balancing while observing the deadlines of the tasks. The experimental results show that the proposed algorithm reduces makespan, enhances resource utilization, and improves load balancing compared to MOHEFT and MCP, two well-known workflow scheduling algorithms from the literature.
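
For readers unfamiliar with SARSA, the on-policy update that each learning agent would apply can be sketched as follows. This is a minimal, generic illustration of the textbook rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)), not the paper's implementation: the abstract does not specify the state encoding, the action set, the reward, or the exploration policy, so the function names, the epsilon-greedy choice, and the values of `alpha` and `gamma` below are all assumptions.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one.

    Assumption: epsilon-greedy exploration; the abstract does not name
    the exploration policy used by the agents.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one SARSA step in place and return the new Q(s, a).

    The target uses the action a_next that the agent actually takes in
    s_next, which is what makes SARSA on-policy.
    """
    td_target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Hypothetical usage: states and actions are opaque labels here. In a
# scheduling setting they might stand for a workflow position and a
# task-to-resource assignment, but that mapping is an assumption.
q_table = defaultdict(float)
rng = random.Random(0)
actions = ["assign_r1", "assign_r2"]
a0 = epsilon_greedy(q_table, "t0", actions, epsilon=0.1, rng=rng)
a1 = epsilon_greedy(q_table, "t1", actions, epsilon=0.1, rng=rng)
sarsa_update(q_table, "t0", a0, reward=1.0, s_next="t1", a_next=a1)
```

Because the target depends on the action actually chosen next, the learned values reflect the agent's own exploratory behaviour, which is the main difference from Q-learning.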
