Towards Accurate Prediction for High-Dimensional and Highly-Variable Cloud Workloads with Deep Learning

Resource provisioning in cloud computing requires adaptive and accurate prediction of cloud workloads. However, existing methods cannot effectively predict high-dimensional and highly-variable cloud workloads, which leads to wasted resources and an inability to satisfy service level agreements (SLAs). Because recurrent neural networks (RNNs) are naturally suited to sequential data analysis, they have recently been applied to workload prediction. However, RNNs often perform poorly at learning long-term dependencies and thus cannot predict workloads accurately. To address these challenges, we propose a deep Learning based Prediction Algorithm for cloud Workloads (L-PAW). First, a top-sparse auto-encoder (TSA) is designed to extract the essential representations of workloads from the original high-dimensional workload data. Next, we integrate the TSA and a gated recurrent unit (GRU) block into an RNN to achieve adaptive and accurate prediction for highly-variable workloads. Using real-world workload traces from Google and Alibaba cloud data centers and the DUX-based cluster, we conduct extensive experiments that demonstrate the effectiveness and adaptability of L-PAW for different types of workloads with various prediction lengths. Moreover, the results show that L-PAW achieves higher prediction accuracy than classic RNN-based and other workload prediction methods on high-dimensional and highly-variable real-world cloud workloads.
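The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the L-PAW implementation: the abstract does not define the TSA, so `top_k_sparsify` assumes that "top-sparse" means keeping only the k largest hidden activations, and `gru_step` uses one common GRU gating convention, in which the update gate weights the candidate state. All function names, parameter names, and values are hypothetical.

```python
import math


def top_k_sparsify(activations, k):
    """Keep the k largest activations and zero the rest.

    Assumption: this is one plausible reading of "top-sparse";
    the abstract does not specify the actual TSA constraint.
    """
    k = max(0, min(k, len(activations)))
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gate(W, U, b, x, h, activation):
    """Compute activation(W x + U h + b) element-wise."""
    return [activation(wx + uh + bi)
            for wx, uh, bi in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update for input x and previous hidden state h.

    p is a dict of hypothetical weights: Wz, Uz, bz (update gate),
    Wr, Ur, br (reset gate), Wh, Uh, bh (candidate state).
    """
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)   # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)   # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    h_tilde = gate(p["Wh"], p["Uh"], p["bh"], x, reset_h, math.tanh)
    # Blend the previous state and the candidate state.
    return [(1.0 - zi) * hi + zi * ci
            for zi, hi, ci in zip(z, h, h_tilde)]


def zero_params(input_dim, hidden_dim):
    """Build all-zero GRU weights, used only for the demo below."""
    def zeros(rows, cols):
        return [[0.0] * cols for _ in range(rows)]
    params = {}
    for name in ("z", "r", "h"):
        params["W" + name] = zeros(hidden_dim, input_dim)
        params["U" + name] = zeros(hidden_dim, hidden_dim)
        params["b" + name] = [0.0] * hidden_dim
    return params


if __name__ == "__main__":
    code = top_k_sparsify([0.1, 0.9, 0.3, 0.7], k=2)
    state = gru_step(code, [1.0, -2.0], zero_params(4, 2))
    print(code, state)
```

In a trained model the sparse code would come from a learned encoder and the GRU weights would be fitted to the workload traces; the all-zero weights here only keep the example deterministic.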
