Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces

Bayesian optimization is known to be difficult to scale to high dimensions because the acquisition step requires solving a non-convex optimization problem in the same search space. To scale the method while retaining its benefits, we propose an algorithm (LineBO) that restricts the problem to a sequence of iteratively chosen one-dimensional sub-problems that can be solved efficiently. We show that our algorithm converges globally and obtains a fast local rate when the function is strongly convex. Further, if the objective has an invariant subspace, our method automatically adapts to the effective dimension without changing the algorithm. When combined with the SafeOpt algorithm to solve the sub-problems, we obtain the first safe Bayesian optimization algorithm with theoretical guarantees applicable in high-dimensional settings. We evaluate our method on multiple synthetic benchmarks, where we obtain competitive performance. Finally, we deploy our algorithm to optimize the beam intensity of the Swiss Free Electron Laser with up to 40 parameters while satisfying safe operation constraints.
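
For intuition, here is a minimal, hypothetical Python sketch of the one-dimensional-subspace idea. It is not the authors' implementation: it assumes an unconstrained maximization problem, draws random unit directions through the current incumbent (other direction oracles, e.g. coordinate-aligned ones, are possible), and solves each line with a small from-scratch GP-UCB sub-solver rather than SafeOpt, so the safety guarantees of the full method are not reflected here. All names and parameters (`line_bo`, `gp_ucb_1d`, the kernel length-scale, the grid resolution) are illustrative choices.

```python
# Sketch of a LineBO-style outer loop (hypothetical, not the paper's code).
# Each outer iteration restricts the objective to a 1-D line through the
# incumbent and runs a simple GP-UCB Bayesian optimizer on that line.
import numpy as np

def rbf(a, b, ls=0.2):
    """Unit-variance RBF kernel between two sets of scalar line positions."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_ucb_1d(f_line, t_low, t_high, n_iters=15, beta=2.0, noise=1e-4):
    """Maximize the scalar function f_line over [t_low, t_high] with GP-UCB."""
    grid = np.linspace(t_low, t_high, 200)
    T = np.array([0.0])                  # start at the incumbent (t = 0)
    Y = np.array([f_line(0.0)])
    for _ in range(n_iters):
        K_inv = np.linalg.inv(rbf(T, T) + noise * np.eye(len(T)))
        k_star = rbf(grid, T)            # cross-covariances grid vs. data
        mu = k_star @ K_inv @ Y          # GP posterior mean on the grid
        var = 1.0 - np.sum((k_star @ K_inv) * k_star, axis=1)
        ucb = mu + beta * np.sqrt(np.maximum(var, 0.0))
        t_next = grid[np.argmax(ucb)]    # acquisition: upper confidence bound
        T = np.append(T, t_next)
        Y = np.append(Y, f_line(t_next))
    return T[np.argmax(Y)], Y.max()

def line_bo(f, x0, n_outer=20, line_len=1.0, seed=0):
    """Outer loop: pick a direction, solve the 1-D sub-problem, keep the best."""
    rng = np.random.default_rng(seed)
    x, fx = np.array(x0, dtype=float), f(x0)
    for _ in range(n_outer):
        d = rng.standard_normal(x.size)
        d /= np.linalg.norm(d)           # random unit direction through x
        f_line = lambda t: f(x + t * d)  # restrict f to the 1-D subspace
        t_best, f_best = gp_ucb_1d(f_line, -line_len, line_len)
        if f_best > fx:                  # move only if the line improved on x
            x, fx = x + t_best * d, f_best
    return x, fx

if __name__ == "__main__":
    # Toy 10-D maximization: a smooth bump with its optimum at the origin.
    f = lambda x: -np.sum(np.asarray(x) ** 2)
    x_opt, f_opt = line_bo(f, x0=np.full(10, 0.5))
    print(x_opt, f_opt)
```

Swapping the GP-UCB sub-solver for a safe solver such as SafeOpt on each line is what yields the safe high-dimensional variant described in the abstract; the outer loop itself is unchanged.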
