Model-driven optimal resource scaling in cloud

Cloud computing offers the flexibility to dynamically resize infrastructure in response to changes in workload demand. While major cloud providers support both horizontal and vertical scaling, these options differ significantly in cost, provisioning time, and impact on workload performance. Importantly, the efficacy of each option depends critically on workload characteristics, such as the workload’s parallelizability and its core scalability. In today’s cloud systems, the scaling decision is left to users, who must therefore fully understand the trade-offs among the scaling options. In this paper, we present a solution for optimizing the resource scaling of cloud deployments, which we implement in OpenStack. The key component of our solution is a modeling engine that characterizes the workload and then quantitatively evaluates the different scaling options for that workload. The modeling engine leverages Amdahl’s Law to model service time scaling in scale-up environments and queueing-theoretic concepts to model performance scaling in scale-out environments. We further employ Kalman filtering to account for inaccuracies in the model-based methodology and to dynamically track changes in the workload and cloud environment.
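To make the scale-up versus scale-out comparison concrete, the following is a minimal sketch and not the paper's actual modeling engine. It assumes that service time under scale-up follows the textbook form of Amdahl's Law, that each node behaves as an M/M/1 queue, and that scale-out splits arrivals evenly across nodes. The scalar Kalman update is likewise a generic one-dimensional filter, shown only to illustrate how a model parameter such as the base service time could be tracked from noisy measurements. All function names and parameter values are hypothetical.

```python
def amdahl_service_time(base_time, cores, parallel_fraction):
    """Service time on `cores` cores under Amdahl's Law (assumed form)."""
    return base_time * ((1.0 - parallel_fraction) + parallel_fraction / cores)


def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue; infinite if the queue is unstable."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")
    return service_time / (1.0 - utilization)


def scale_up_response(arrival_rate, base_time, cores, parallel_fraction):
    """One larger node: service time shrinks with cores, all load on one queue."""
    service_time = amdahl_service_time(base_time, cores, parallel_fraction)
    return mm1_response_time(arrival_rate, service_time)


def scale_out_response(arrival_rate, base_time, nodes):
    """Several single-core nodes: load split evenly, service time unchanged."""
    return mm1_response_time(arrival_rate / nodes, base_time)


def kalman_update(estimate, variance, measurement, process_var, measurement_var):
    """One predict-and-correct step of a scalar Kalman filter (random-walk state)."""
    variance = variance + process_var                 # predict: uncertainty grows
    gain = variance / (variance + measurement_var)    # Kalman gain
    estimate = estimate + gain * (measurement - estimate)
    variance = (1.0 - gain) * variance
    return estimate, variance


if __name__ == "__main__":
    # Hypothetical workload: 8 requests/s, 100 ms base service time, 4 units of capacity.
    for p in (0.2, 0.5, 0.9):
        up = scale_up_response(8.0, 0.1, 4, p)
        out = scale_out_response(8.0, 0.1, 4)
        print(f"parallel fraction {p}: scale-up {up:.4f} s, scale-out {out:.4f} s")
```

Under these assumptions, a weakly parallel workload (parallel fraction 0.2) responds faster when scaled out, a highly parallel one (0.9) responds faster when scaled up, and the two options tie at 0.5 for this particular load, which illustrates why the preferred option depends on workload characteristics.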
