Providing Performance Guarantees for Cloud-Deployed Applications

Applications with dynamic workload demands need access to a flexible infrastructure to meet performance guarantees and minimize resource costs. While cloud computing provides the elasticity to scale the infrastructure on demand, cloud service providers lack control over, and visibility into, user-space applications, making it difficult to scale the infrastructure accurately. Thus, the burden of scaling falls on the user, who must determine when to trigger scaling and how much to scale. Scaling becomes even more challenging when applications exhibit dynamic changes in their behavior. In this paper, we propose a new cloud service, Dependable Compute Cloud (DC2), that automatically scales the infrastructure to meet user-specified performance requirements, even when multiple user requests execute concurrently. DC2 employs Kalman filtering to automatically learn the (possibly changing) system parameters for each application, allowing it to proactively scale the infrastructure to meet performance guarantees. Importantly, DC2 is designed for the cloud: it is application-agnostic and does not require any offline application profiling or benchmarking, training data, or expert knowledge about the application. We evaluate DC2 via implementation on OpenStack using a multi-tier application under a range of workload mixes and arrival traces. Our experimental results demonstrate that DC2 is robust and outperforms existing rule-based approaches in avoiding SLA violations and minimizing resource consumption.
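To make the idea of learning system parameters online concrete, the following is a minimal sketch of a scalar Kalman filter. It is not DC2's actual model, which the abstract does not specify. It assumes a single unknown parameter (the per-request service demand), assumes the Utilization Law (utilization = arrival rate × service demand) as the measurement model, and models the parameter as a slow random walk. The names `ScalarKalman` and `servers_needed`, and the noise variances, are hypothetical choices for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Scalar Kalman filter tracking one slowly drifting parameter.

    Assumption (not from the paper): the hidden state is the service
    demand in seconds per request, and each measurement is a utilization
    sample that obeys utilization = arrival_rate * demand + noise.
    """
    estimate: float             # current estimate of the service demand
    variance: float             # uncertainty (variance) of that estimate
    process_var: float = 1e-6   # assumed drift of the parameter per step
    measure_var: float = 1e-4   # assumed variance of utilization noise

    def update(self, arrival_rate: float, utilization: float) -> float:
        # Predict: a random-walk model keeps the estimate and inflates
        # its variance by the process noise.
        p_pred = self.variance + self.process_var

        # The measurement matrix is the observed arrival rate.
        h = arrival_rate
        innovation = utilization - h * self.estimate
        innovation_var = h * p_pred * h + self.measure_var

        # Correct: blend prediction and measurement via the Kalman gain.
        gain = p_pred * h / innovation_var
        self.estimate += gain * innovation
        self.variance = (1.0 - gain * h) * p_pred
        return self.estimate


def servers_needed(arrival_rate: float, demand: float, target_util: float) -> int:
    """Smallest server count keeping per-server utilization at or below
    target_util, assuming load is spread evenly (a simplification)."""
    return max(1, math.ceil(arrival_rate * demand / target_util))
```

In use, each monitoring interval would feed the observed arrival rate and utilization to `update`, and the refreshed estimate would drive the scaling decision through something like `servers_needed`. Because the filter re-estimates the parameter at every step, it can follow changes in application behavior without offline profiling, which is the property the abstract emphasizes.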
