Micky: A Cheaper Alternative for Selecting Cloud Instances

Most cloud computing optimizers explore and improve one workload at a time. When optimizing many workloads, the single-optimizer approach can be prohibitively expensive. Accordingly, we examine "collective optimizers" that concurrently explore and improve a set of workloads, significantly reducing measurement costs. Our large-scale empirical study shows that there is often a single cloud configuration that is surprisingly near-optimal for most workloads. Consequently, we create a collective optimizer, MICKY, that reformulates the task of finding the near-optimal cloud configuration as a multi-armed bandit problem. MICKY efficiently balances exploration (of new cloud configurations) and exploitation (of known good cloud configurations). Our experiments show that MICKY achieves, on average, an 8.6-fold reduction in measurement cost compared to the state-of-the-art method while still finding near-optimal solutions. Hence, we propose MICKY as the basis of a practical collective optimization method for finding good cloud configurations (subject to constraints such as budget and tolerance for near-optimal configurations).
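
To make the bandit framing concrete, the following is a minimal sketch and not MICKY's actual implementation. It assumes each cloud configuration is an arm, each measurement runs a randomly chosen workload on the selected configuration, and the reward is a normalized performance score in [0, 1]. The selection rule is standard UCB1, chosen here only for illustration; the reward table, the function names, and the budget value are all hypothetical.

```python
import math
import random

# Hypothetical reward table: REWARDS[w][c] is the normalized performance
# (higher is better, in [0, 1]) of configuration c on workload w.
# Configuration 1 is near-optimal for every workload in this toy data.
REWARDS = [
    [0.2, 0.9, 0.4],
    [0.3, 1.0, 0.5],
    [0.1, 0.8, 0.6],
    [0.5, 0.9, 0.3],
]


def ucb1_select(counts, totals, t):
    """Pick an arm with UCB1: mean reward plus an exploration bonus."""
    # Measure every configuration once before applying the UCB rule.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: totals[a] / counts[a]
        + math.sqrt(2 * math.log(t) / counts[a]),
    )


def collective_bandit(reward_table, budget, seed=0):
    """Spend `budget` measurements across all workloads collectively.

    Returns the configuration with the best mean reward and the number
    of measurements spent on each configuration.
    """
    rng = random.Random(seed)
    n_configs = len(reward_table[0])
    counts = [0] * n_configs
    totals = [0.0] * n_configs
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, totals, t)
        # One measurement: run a randomly drawn workload on this configuration.
        workload = rng.randrange(len(reward_table))
        counts[arm] += 1
        totals[arm] += reward_table[workload][arm]
    # Ignore configurations that were never measured (budget too small).
    measured = [a for a in range(n_configs) if counts[a] > 0]
    best = max(measured, key=lambda a: totals[a] / counts[a])
    return best, counts


best, counts = collective_bandit(REWARDS, budget=60)
print("recommended configuration:", best)
print("measurements per configuration:", counts)
```

The measurement budget is shared across all workloads rather than spent per workload, which is where the cost saving of a collective optimizer would come from under these assumptions.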
