Learning How to Optimize Black-Box Functions With Extreme Limits on the Number of Function Evaluations

We consider black-box optimization in which only an extremely limited number of function evaluations, on the order of 100, are affordable, and the function evaluations must be performed in even fewer batches of a limited number of parallel trials. This is a typical scenario when optimizing variable settings that are very costly to evaluate, for example in simulation-based optimization or machine learning hyperparameter optimization. We propose an original method that uses established approaches to propose a set of candidate points for each batch and then down-selects from these candidates to the number of trials that can be run in parallel. The key novelty of our approach is a hyperparameterized method for down-selecting the candidates to the allowed batch size, which is optimized offline using automated algorithm configuration. We tune this method for black-box optimization and then evaluate it on classical black-box optimization benchmarks. Our results show that it is possible to learn how to combine evaluation points suggested by highly diverse black-box optimization methods, conditioned on the progress of the optimization. Compared with the state of the art in black-box minimization and various other methods specifically geared towards few-shot minimization, we achieve an average reduction of 50% in normalized cost, a highly significant improvement in performance.
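To make the propose-then-down-select scheme concrete, the following is a minimal sketch, not the paper's actual method: two hypothetical candidate generators (uniform global sampling and Gaussian perturbation of the incumbent) each suggest points, and a hand-weighted scoring rule, standing in for the hyperparameters the paper tunes offline via algorithm configuration, picks a batch conditioned on how much of the evaluation budget has been spent. All function and parameter names here are illustrative assumptions.

```python
import random


def sphere(x):
    """Toy black-box objective: sum of squares, minimum at the origin."""
    return sum(xi * xi for xi in x)


def random_candidates(dim, n, rng):
    """Global exploration: uniform samples in [-5, 5]^dim."""
    return [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]


def local_candidates(best, n, rng, sigma=0.5):
    """Local exploitation: Gaussian perturbations of the incumbent."""
    return [[xi + rng.gauss(0, sigma) for xi in best] for _ in range(n)]


def down_select(cands, best, progress, batch_size, w_explore=1.0, w_exploit=1.0):
    """Keep batch_size candidates, shifting preference from far-away points
    (early, progress near 0) to points near the incumbent (late, near 1)."""
    def score(x):
        dist = sum((a - b) ** 2 for a, b in zip(x, best)) ** 0.5
        return (1 - progress) * w_explore * dist - progress * w_exploit * dist
    return sorted(cands, key=score, reverse=True)[:batch_size]


def optimize(dim=3, budget=100, batch_size=10, seed=0):
    """Run the batched loop under an extreme evaluation budget."""
    rng = random.Random(seed)
    best = [rng.uniform(-5, 5) for _ in range(dim)]
    best_val = sphere(best)
    used = 1
    while used + batch_size <= budget:
        progress = used / budget
        # Over-generate candidates from diverse sources, then down-select.
        cands = (random_candidates(dim, 3 * batch_size, rng)
                 + local_candidates(best, 3 * batch_size, rng))
        for x in down_select(cands, best, progress, batch_size):
            v = sphere(x)  # the batch would run in parallel in practice
            if v < best_val:
                best, best_val = x, v
        used += batch_size
    return best_val
```

In the paper, the scoring weights (here `w_explore`, `w_exploit`) are not hand-set but learned offline with an algorithm configurator, and the candidate generators are full optimizers rather than these toy samplers.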
