Trading Convergence Rate with Computational Budget in High Dimensional Bayesian Optimization

Scaling Bayesian optimisation (BO) to high-dimensional search spaces is an active and open research problem, particularly when no assumptions are made about the function's structure. The main difficulty is that, at each iteration, BO requires the global maximisation of an acquisition function, which is itself a non-convex optimisation problem over the original search space. As the dimension grows, the computational budget for this maximisation becomes increasingly inadequate, leading to inaccurate solutions. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach in which the acquisition function only needs to be maximised on a discrete set of low-dimensional subspaces embedded in the original high-dimensional search space. Unlike many recent high-dimensional BO methods, our method is free of any low-dimensional structural assumption on the function. Optimising the acquisition function in low-dimensional subspaces allows our method to obtain accurate solutions within a limited computational budget. We show that, despite this convenience, our algorithm remains convergent; in particular, its cumulative regret grows only sub-linearly with the number of iterations. More importantly, as our regret bounds make evident, our algorithm provides a way to trade the convergence rate against the number of subspaces used in the optimisation. Finally, when the number of subspaces is "sufficiently large", the cumulative regret of our algorithm is at most O*(√(TγT)), as opposed to O*(√(DTγT)) for the GP-UCB of Srinivas et al. (2012), removing a crucial factor of √D, where D is the dimension of the input space. We perform extensive empirical experiments showing that the sample efficiency of our method is better than that of existing methods on many optimisation problems with dimensions up to 5,000.
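The core idea, maximising the acquisition function over a discrete set of low-dimensional subspaces rather than over the full D-dimensional space, can be sketched as follows. This is an illustrative simplification, not the paper's exact algorithm: the acquisition function here is a hypothetical toy surrogate, the subspaces are random one-dimensional lines through an incumbent point, and the per-subspace maximisation is a coarse grid search.

```python
import math
import random

random.seed(0)  # for reproducibility of this sketch

def acquisition(x):
    # Toy stand-in for a BO acquisition function (hypothetical):
    # a concave surrogate peaked at the all-ones point.
    return -sum((xi - 1.0) ** 2 for xi in x)

def maximise_on_subspaces(acq, centre, D, num_subspaces=20, grid=50, radius=2.0):
    """Maximise `acq` over a discrete set of random 1-D subspaces
    (lines through `centre`) instead of over the full D-dim space.
    Each 1-D problem is cheap, so the total budget stays small
    even for large D."""
    best_x, best_val = list(centre), acq(centre)
    for _ in range(num_subspaces):
        # Draw a random unit direction defining the subspace.
        d = [random.gauss(0.0, 1.0) for _ in range(D)]
        norm = math.sqrt(sum(di * di for di in d))
        d = [di / norm for di in d]
        # Accurate low-dimensional maximisation: grid search on the line.
        for i in range(grid + 1):
            t = -radius + 2.0 * radius * i / grid
            x = [c + t * di for c, di in zip(centre, d)]
            v = acq(x)
            if v > best_val:
                best_x, best_val = x, v
    return best_x, best_val

D = 100
centre = [0.0] * D
x_star, val = maximise_on_subspaces(acquisition, centre, D)
```

Increasing `num_subspaces` makes the discrete set of subspaces a better cover of the full space, which mirrors the paper's trade-off between convergence rate and the number of subspaces used.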
