Sequential vs. Integrated Algorithm Selection and Configuration: A Case Study for the Modular CMA-ES

When faced with a specific optimization problem, choosing which algorithm to use is a difficult task. Not only is there a vast variety of algorithms to select from, but these algorithms are often controlled by many hyperparameters, which need to be tuned to achieve the best possible performance. Usually, this problem is split into two parts: algorithm selection and algorithm configuration. With the significant advances made in machine learning, however, these problems can be integrated into a combined algorithm selection and hyperparameter optimization task, commonly known as the CASH problem. In this work we compare sequential and integrated algorithm selection and configuration approaches for the case of selecting and tuning the best out of 4608 variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) tested on the Black Box Optimization Benchmark (BBOB) suite. We first show that the ranking of the modular CMA-ES variants depends to a large extent on the quality of their hyperparameters. This implies that even a sequential approach based on complete enumeration of the algorithm space will likely result in sub-optimal solutions. In fact, we show that the integrated approach provides competitive results at a much smaller computational cost. We also compare two mixed-integer algorithm configuration techniques, irace and Mixed-Integer Parallel Efficient Global Optimization (MIP-EGO). While the two methods differ significantly in how they balance exploration and exploitation, their overall performances are very similar.
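The difference between the sequential and the integrated (CASH-style) approach can be illustrated with a small, self-contained sketch. Everything below is a hypothetical stand-in: the toy benchmark surrogate, the variant names, and the hyperparameter grid are illustrative assumptions, not the paper's actual modular CMA-ES framework or the BBOB suite.

```python
# Toy stand-in for benchmarking one (variant, hyperparameter) pair;
# lower is better. This surrogate and the variant names are purely
# illustrative -- not the paper's modular CMA-ES or the BBOB setup.
def run_benchmark(variant, step_size):
    base = {"A": 0.6, "B": 0.4, "C": 0.8}[variant]
    # Each variant has a different optimal step size, so a ranking
    # obtained under default hyperparameters can be misleading.
    opt = {"A": 0.5, "B": 0.0, "C": 0.3}[variant]
    return base + (step_size - opt) ** 2

variants = ["A", "B", "C"]
grid = [i / 10 for i in range(11)]  # candidate hyperparameter values
DEFAULT = 0.5                       # assumed "default" hyperparameter

# Sequential approach: rank all variants under the default
# hyperparameters, then tune only the winner.
best_variant = min(variants, key=lambda v: run_benchmark(v, DEFAULT))
seq_score = min(run_benchmark(best_variant, s) for s in grid)

# Integrated (CASH-style) approach: search jointly over the mixed
# space of discrete variants and their hyperparameters.
int_score, int_variant = min(
    (run_benchmark(v, s), v) for v in variants for s in grid
)

print("sequential :", best_variant, round(seq_score, 3))  # A 0.6
print("integrated :", int_variant, round(int_score, 3))   # B 0.4
```

In this toy landscape the sequential approach commits to variant A, which looks best under default hyperparameters, and tuning can no longer recover; the joint search finds that variant B dominates once its hyperparameter is tuned. This mirrors the paper's observation that variant rankings depend strongly on hyperparameter quality.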
