Learning Predictive Autoscaling Policies for Cloud-Hosted Microservices Using Trace-Driven Modeling

Autoscaling methods are important for ensuring response time guarantees for cloud-hosted microservices. Most existing state-of-the-art autoscaling methods use rule-based reactive policies with static thresholds defined either on monitored resource consumption metrics, such as CPU and memory utilization, or on application-level metrics, such as response time. However, it is challenging to determine the threshold values that minimize both resource consumption and performance violations. Predictive autoscaling methods can help address this challenge, but they require considerable time to collect enough performance traces, covering the different resource provisioning possibilities for a target infrastructure, to train a useful predictive autoscaling model. In this paper, we tackle this problem by proposing a system that models the response time of microservices through stress testing and then uses trace-driven simulation to automatically learn a predictive autoscaling model that satisfies response time requirements. The proposed solution reduces the need to collect performance traces for learning a predictive autoscaling model. Our experimental evaluation on the AWS cloud, using a microservice under realistic dynamic workloads, validates the proposed solution. The validation results show that the proposed autoscaling method performs well in satisfying the response time requirement, at only 4.5% extra cost compared to the reactive autoscaling method.
