Auto scaling virtual machines for web applications with queueing theory

With the rapid development of cloud computing in recent years, more and more individuals and corporations use cloud computing platforms to deploy their web applications, which can significantly reduce their deployment costs. However, the number of accesses to a web application often fluctuates over time, resulting in the so-called peak-valley phenomenon: resources are usually reserved in proportion to the peak demand for physical resources, yet the demand is far below that peak most of the time, so physical servers sit idle for most of the time. To solve this problem, we establish an M/M/c queueing model, which represents an infinite customer source served by multiple service windows (servers). Based on this queueing model, we can accurately predict the arrival time of each customer, which enables us to calculate the minimum amount of resources that meets the resource needs. Then, we use heuristic algorithms and dynamic programming to design Virtual Machine (VM) auto-scaling strategies, including horizontal scaling and vertical scaling. With the proposed model and scaling algorithms, web applications can meet customer needs while using the least amount of resources, which improves resource utilization and minimizes deployment costs. Through extensive experiments, we show that the proposed model and scaling algorithms can greatly improve resource utilization without sacrificing web application performance.
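As a rough illustration of how an M/M/c model can yield a minimum resource count, the sketch below uses the standard Erlang C formula to find the smallest number of identical servers whose mean queueing delay stays under a target. The abstract does not state which performance target or formulas the authors use, so the mean-wait target and the names `erlang_c`, `mean_wait`, and `min_servers` are assumptions made for illustration only.

```python
import math


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arrival has to wait in an M/M/c queue.

    offered_load is a = lambda / mu (in Erlangs). Returns 1.0 when the
    system is unstable (a >= c), since every arrival then waits.
    """
    if offered_load >= c:
        return 1.0
    # Erlang B via the numerically stable recursion
    # B(0) = 1, B(k) = a * B(k-1) / (k + a * B(k-1)).
    b = 1.0
    for k in range(1, c + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / c  # per-server utilization
    # Convert Erlang B to Erlang C.
    return b / (1.0 - rho * (1.0 - b))


def mean_wait(c: int, lam: float, mu: float) -> float:
    """Mean time a request spends queued (excluding service) in M/M/c."""
    a = lam / mu
    if a >= c:
        return math.inf
    return erlang_c(c, a) / (c * mu - lam)


def min_servers(lam: float, mu: float, max_mean_wait: float) -> int:
    """Smallest c whose mean queueing delay is at most max_mean_wait.

    Hypothetical sizing rule: the target metric is an assumption,
    not taken from the paper.
    """
    c = max(1, math.floor(lam / mu) + 1)  # smallest stable server count
    while mean_wait(c, lam, mu) > max_mean_wait:
        c += 1
    return c


if __name__ == "__main__":
    # Example: 100 requests/s arriving, each server handles 10 requests/s,
    # and the mean queueing delay should stay below 10 ms.
    print(min_servers(lam=100.0, mu=10.0, max_mean_wait=0.01))
```

Under this reading, a horizontal scaler would re-run `min_servers` whenever the estimated arrival rate changes and add or remove VMs to match; a vertical scaler would instead vary the per-server rate `mu`. Both interpretations are a sketch under the stated assumptions, not the strategy described in the paper.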
