Machine Learning (ML)-Centric Resource Management in Cloud Computing: A Review and Future Directions

Cloud computing has rapidly emerged as a model for delivering Internet-based utility computing services. Within cloud computing, Infrastructure as a Service (IaaS) is one of the most important and fastest-growing service models. In this model, cloud providers supply users and machines with resources such as virtual machines, raw (block) storage, firewalls, load balancers, and network devices. Resource management is one of the most important aspects of IaaS. Effective resource management offers scalability, quality of service, optimum utility, reduced overheads, increased throughput, reduced latency, specialised environments, cost effectiveness, and a streamlined interface. Traditionally, resource management has relied on static policies, which impose limitations in dynamic scenarios and have prompted cloud service providers to adopt data-driven, machine-learning-based approaches. Machine learning is now used for a variety of resource management tasks, including workload estimation, task scheduling, VM consolidation, resource optimization, and energy optimization. This paper provides a detailed review of the challenges in ML-based resource management identified in current research, the approaches currently used to resolve them, and the advantages and limitations of those approaches. Finally, we propose potential future research directions based on the identified challenges and limitations.
