PASCAL: An architecture for proactive auto-scaling of distributed services

Abstract: One of the main characteristics that make cloud services so popular today is their elasticity, i.e., their ability to adapt provisioning to variable workloads, thus increasing resource utilization and reducing operating costs. At the core of any elastic service lies an automatic scaling mechanism that drives provisioning on the basis of a given strategy. In this paper we propose PASCAL, an architecture for Proactive Auto-SCALing of generic distributed services. PASCAL combines a proactive approach, which forecasts incoming workloads, with a profiling system, which estimates the provisioning required to serve them. Scale-in/out operations are decided according to an application-specific strategy that aims to provision the minimum number of resources needed to sustain the forecast workload. The main novelties introduced with the PASCAL architecture are: (i) a strategy to proactively auto-scale a distributed stream processing system (namely, Apache Storm), which aims to load-balance operators through an accurate model for estimating system performance, and (ii) a strategy to proactively auto-scale a distributed datastore (namely, Apache Cassandra), which focuses on choosing when to execute scaling actions on the basis of the time needed to activate or deactivate storage nodes, so that the target configuration is ready when needed. We provide a prototype implementation of PASCAL for both use cases and, through an experimental evaluation conducted on a private cloud, we validate our approach and demonstrate the effectiveness of the proposed strategies in terms of saved resources and response time.
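
The interplay between forecasting, profiling, and activation time described in the abstract can be illustrated with a minimal sketch. It is not the PASCAL implementation: the function names, the fixed per-node capacity standing in for the profile, and the discrete time steps are all assumptions made for illustration. The sketch computes the minimum node count per forecast step and then looks ahead by the activation delay, so that nodes are started early enough to be ready when the load arrives.

```python
import math
from typing import List


def min_nodes(workload: float, capacity_per_node: float, floor: int = 1) -> int:
    """Smallest node count whose total capacity sustains the workload.

    capacity_per_node is a hypothetical stand-in for a profiling result.
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    return max(floor, math.ceil(workload / capacity_per_node))


def target_nodes(forecast: List[float],
                 capacity_per_node: float,
                 activation_steps: int) -> List[int]:
    """Nodes that should be active or starting at each time step.

    A node started at step s is assumed to become ready at step
    s + activation_steps, so the target at s must cover the largest
    requirement in the window [s, s + activation_steps].
    """
    if activation_steps < 0:
        raise ValueError("activation_steps must be non-negative")
    required = [min_nodes(w, capacity_per_node) for w in forecast]
    return [max(required[s:s + activation_steps + 1])
            for s in range(len(required))]


if __name__ == "__main__":
    # Hypothetical forecast in requests per second, one value per step.
    forecast = [100.0, 100.0, 250.0, 400.0, 120.0]
    print(target_nodes(forecast, capacity_per_node=100.0, activation_steps=1))
```

With an activation delay of zero the targets equal the per-step minimum; a longer delay brings scale-out forward and postpones scale-in until the look-ahead window no longer contains a peak.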
