Auto-Scaling Cloud Resources using LSTM and Reinforcement Learning to Guarantee Service-Level Agreements and Reduce Resource Costs

Auto-scaling of cloud resources aims to respond to application demand by automatically scaling compute resources at runtime, so as to guarantee service-level agreements (SLAs) and reduce resource costs. Existing approaches often rely on predefined sets of rules that add or remove resources depending on application usage. However, optimal adaptation rules are difficult to devise and to generalize. This paper proposes a proactive approach to auto-scaling cloud resources in response to dynamic traffic changes. It applies Long Short-Term Memory (LSTM) to predict the number of requests in the next time step and applies Reinforcement Learning (RL) to obtain the optimal action for scaling virtual machines in or out. To validate the proposal, experiments are conducted on two real-world workload traces. The results show that the approach keeps virtual machines working steadily and reduces SLA violations by 10%-30% compared with other approaches.
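The predict-then-act loop described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the paper's implementation: tabular Q-learning stands in for the RL component, a naive last-value forecast stands in for the LSTM predictor, and the names `CAPACITY_PER_VM`, `MAX_VMS`, `load_bucket`, `reward`, `naive_forecast`, `train`, and `best_action`, along with all numeric weights, are hypothetical.

```python
import random

ACTIONS = (-1, 0, 1)    # scale in, hold, scale out (one VM per step; assumption)
CAPACITY_PER_VM = 100   # assumed requests one VM can serve per time step
MAX_VMS = 10            # assumed upper bound on the VM pool


def load_bucket(predicted_requests):
    """Discretize a forecast into the number of VMs it would require."""
    needed = -(-int(predicted_requests) // CAPACITY_PER_VM)  # ceiling division
    return max(1, min(MAX_VMS, needed))


def reward(vms, actual_requests):
    """Assumed reward: large penalty for an SLA violation, small cost per VM."""
    violation = actual_requests > vms * CAPACITY_PER_VM
    return -(10.0 if violation else 0.0) - 0.1 * vms


def naive_forecast(history):
    """Placeholder for the LSTM predictor: repeat the last observed value."""
    return history[-1]


def train(trace, episodes=300, alpha=0.2, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (current VMs, forecast bucket)."""
    rng = random.Random(seed)
    q = {}

    def values(state):
        return q.setdefault(state, [0.0] * len(ACTIONS))

    for _ in range(episodes):
        vms = 1
        state = (vms, load_bucket(naive_forecast(trace[:1])))
        for t in range(1, len(trace)):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=values(state).__getitem__)
            # Apply the scaling action, then observe the real load.
            vms = max(1, min(MAX_VMS, vms + ACTIONS[a]))
            r = reward(vms, trace[t])
            next_state = (vms, load_bucket(naive_forecast(trace[:t + 1])))
            # Standard Q-learning update.
            target = r + gamma * max(values(next_state))
            values(state)[a] += alpha * (target - values(state)[a])
            state = next_state
    return q


def best_action(q, vms, predicted_requests):
    """Greedy scaling decision for the current VM count and forecast."""
    vals = q.get((vms, load_bucket(predicted_requests)), [0.0] * len(ACTIONS))
    return ACTIONS[max(range(len(ACTIONS)), key=vals.__getitem__)]


if __name__ == "__main__":
    q_table = train([300] * 20)          # synthetic constant workload
    print(best_action(q_table, 1, 300))  # decision with 1 VM and a forecast of 300
```

In a fuller version, `naive_forecast` would presumably be replaced by a trained LSTM model, and the reward weights would be tuned to the SLA and pricing in question.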
