Do the best cloud configurations grow on trees?

Cloud configuration optimization is the process of determining the number and type of instances to use when deploying an application in a cloud environment, given a cost or performance objective. When no performance model of the distributed application is available, black-box optimization can be used to configure the cloud deployment automatically. Numerous black-box optimization algorithms have been developed; however, their comparative evaluation has so far been limited to the hyper-parameter optimization setting, which differs significantly from the cloud configuration problem. In this paper, we evaluate 8 commonly used black-box optimization algorithms to determine their applicability to the cloud configuration problem. Our evaluation, using 23 different workloads, shows that in several cases Bayesian optimization with gradient boosted regression trees performs better than the methods chosen by prior work.
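To make the problem setting concrete, the following is a minimal sketch of black-box search over a cloud configuration space. It is not the method evaluated in the paper: random search stands in for whichever optimizer is chosen, and the instance types, prices, node counts, and the synthetic cost model in `run_workload` are all hypothetical assumptions made for illustration. In practice, `run_workload` would deploy the application and measure its actual cost or running time.

```python
import itertools
import random

# Hypothetical search space: instance names, speeds, and prices are invented.
INSTANCE_TYPES = {
    "small": {"speed": 1.0, "price": 0.10},
    "medium": {"speed": 2.0, "price": 0.22},
    "large": {"speed": 4.0, "price": 0.50},
}
NODE_COUNTS = [2, 4, 8, 16, 32]


def run_workload(instance_type, nodes):
    """Stand-in for deploying and timing the real application.

    Synthetic model (an assumption): runtime has a fixed serial part, a part
    that parallelizes across nodes, and a coordination overhead that grows
    with cluster size. Returns the deployment cost in dollars.
    """
    spec = INSTANCE_TYPES[instance_type]
    runtime_hours = 0.2 + 10.0 / (spec["speed"] * nodes) + 0.01 * nodes
    return runtime_hours * spec["price"] * nodes


def random_search(objective, space, budget, seed=0):
    """Evaluate up to `budget` distinct configurations sampled uniformly."""
    rng = random.Random(seed)
    candidates = rng.sample(space, k=min(budget, len(space)))
    evaluated = [(objective(*cfg), cfg) for cfg in candidates]
    return min(evaluated)  # lowest cost first; ties broken by configuration


# Every (instance type, node count) pair is one candidate configuration.
SPACE = list(itertools.product(INSTANCE_TYPES, NODE_COUNTS))
best_cost, best_config = random_search(run_workload, SPACE, budget=6)
print(best_config, round(best_cost, 3))
```

The optimizer sees only configuration inputs and measured costs, which is what makes the setting black-box. A model-based method such as Bayesian optimization would replace the uniform sampling step with a surrogate model that proposes the next configuration to try.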
