FLAS: A combination of proactive and reactive auto-scaling architecture for distributed services

Abstract Cloud computing has established itself as the foundation for the vast majority of emerging technologies, mainly because of the elasticity it offers. Auto-scalers are the systems that enable this elasticity by acquiring and releasing resources on demand to ensure an agreed service level. In this article we present FLAS (Forecasted Load Auto-Scaling), an auto-scaler for distributed services that combines the advantages of proactive and reactive approaches, choosing between them according to the situation to decide the optimal scaling actions at every moment. The main novelties introduced by FLAS are (i) a predictive model of the trend of high-level metrics, which makes it possible to anticipate changes in the relevant SLA parameters (e.g. performance metrics such as response time or throughput), and (ii) a reactive contingency system based on estimating high-level metrics from resource-usage metrics, which reduces the necessary instrumentation (making it less invasive) and allows it to be adapted agnostically to different applications. We provide a FLAS implementation for the use case of a content-based publish–subscribe middleware (E-SilboPS) that is the cornerstone of an event-driven architecture. To the best of our knowledge, this is the first auto-scaling system for content-based publish–subscribe distributed systems, although it is generic enough to fit any distributed service. We evaluated FLAS on several test cases that recreate not only the expected contexts of use but also the worst possible scenarios, following the Boundary-Value Analysis (BVA) test methodology. The evaluation validates our approach and demonstrates its effectiveness: performance requirements were met more than 99% of the time.
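
As a rough illustration of how a proactive forecast and a reactive contingency check might be combined, the following minimal sketch extrapolates a linear trend of response time and, separately, estimates response time from CPU utilization. It is not the FLAS implementation: the function names, the thresholds, the least-squares trend and the single-queue formula R = S / (1 - U) are all assumptions chosen for brevity.

```python
from statistics import mean


def forecast_trend(samples, horizon):
    """Fit a least-squares line to the samples and extrapolate it
    `horizon` steps past the last sample (hypothetical trend model)."""
    n = len(samples)
    if n < 2:
        return float(samples[-1])  # not enough points to fit a slope
    xs = range(n)
    x_mean, y_mean = mean(xs), mean(samples)
    denom = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, samples)) / denom
    return y_mean + slope * ((n - 1 + horizon) - x_mean)


def estimate_response_time(cpu_util, service_ms=20.0):
    """Estimate a high-level metric (response time, ms) from a resource
    metric (CPU utilization in [0, 1]). Assumed mapping: R = S / (1 - U)."""
    u = min(max(cpu_util, 0.0), 0.99)  # clamp to avoid division by zero
    return service_ms / (1.0 - u)


def decide(rt_history, cpu_util, sla_ms=100.0, horizon=3, scale_in_fraction=0.4):
    """Return (action, trigger). All thresholds are illustrative assumptions."""
    predicted = forecast_trend(rt_history, horizon)
    estimated = estimate_response_time(cpu_util)
    if estimated > sla_ms:
        return ("scale_out", "reactive")   # the SLA is already at risk now
    if predicted > sla_ms:
        return ("scale_out", "proactive")  # the trend will breach the SLA soon
    if max(predicted, estimated) < scale_in_fraction * sla_ms:
        return ("scale_in", "proactive")   # ample headroom on both signals
    return ("hold", "none")


if __name__ == "__main__":
    print(decide([40, 50, 60, 70, 80], cpu_util=0.5))
    print(decide([30, 30, 30, 30], cpu_util=0.9))
```

In this sketch the reactive check takes priority, so a sudden load spike that the trend model has not yet captured still triggers scaling. The proactive branch only acts when current resource use looks safe but the extrapolated metric would cross the SLA threshold.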
