Sparse Spectrum Gaussian Process for Bayesian Optimisation

We propose a novel sparse spectrum approximation of the Gaussian process (GP) tailored for Bayesian optimization. Whilst current sparse spectrum methods provide good approximations for regression problems, this particular form of sparse approximation is observed to generate an overconfident GP, i.e. it produces less epistemic uncertainty than the original GP. Since the balance between the predictive mean and the predictive variance is the key determinant of the success of Bayesian optimization, current sparse spectrum methods are less suitable for it. To fix this over-confidence issue specifically for Bayesian optimization, we derive a new regularized marginal likelihood for finding the optimal frequencies. The regularizer trades off accuracy in model fitting against a targeted increase in the predictive variance of the resultant GP. Specifically, we use the entropy of the global maximum distribution from the posterior GP as the regularizer, which is to be maximized. Since this distribution cannot be calculated analytically, we first propose a Thompson sampling based approach and then a more efficient sequential Monte Carlo based approach to estimate it. We also show that the Expected Improvement acquisition function can be used as a proxy for the maximum distribution, making the whole process more efficient. Experiments show considerable improvement in the Bayesian optimization convergence rate over the vanilla sparse spectrum method, and over a full GP when its covariance matrix is ill-conditioned due to a large number of observations.
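The Thompson sampling based estimate of the entropy of the global maximum distribution can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard trigonometric-feature form of a sparse spectrum GP (a Bayesian linear model over cosine and sine features), a finite candidate grid in place of the continuous domain, and a histogram estimate of the entropy. The function names `features`, `weight_posterior` and `max_location_entropy`, and the hyperparameters `signal_var` and `noise_var`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def features(X, W, signal_var=1.0):
    """Trigonometric features for spectral frequencies W (shape m x d)."""
    proj = X @ W.T  # (n, m) projections of the inputs onto the frequencies
    m = W.shape[0]
    return np.sqrt(signal_var / m) * np.hstack([np.cos(proj), np.sin(proj)])


def weight_posterior(X, y, W, noise_var=1e-2):
    """Gaussian posterior over feature weights under a N(0, I) prior."""
    Phi = features(X, W)
    A = Phi.T @ Phi + noise_var * np.eye(Phi.shape[1])
    mean = np.linalg.solve(A, Phi.T @ y)
    cov = noise_var * np.linalg.inv(A)
    return mean, 0.5 * (cov + cov.T)  # symmetrise against round-off


def max_location_entropy(X, y, W, candidates, n_samples=500,
                         noise_var=1e-2, seed=0):
    """Thompson-sampling estimate of the entropy of the maximiser's location.

    Assumption: the maximum distribution is approximated by a histogram of
    argmax locations of sampled posterior functions over `candidates`.
    """
    rng = np.random.default_rng(seed)
    mean, cov = weight_posterior(X, y, W, noise_var)
    weights = rng.multivariate_normal(mean, cov, size=n_samples)
    paths = weights @ features(candidates, W).T  # (n_samples, n_candidates)
    winners = np.argmax(paths, axis=1)
    probs = np.bincount(winners, minlength=len(candidates)) / n_samples
    nonzero = probs[probs > 0]
    return float(-np.sum(nonzero * np.log(nonzero)))


# Toy usage on synthetic 1-D data (all values are illustrative).
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)
W = rng.standard_normal((10, 1))          # 10 spectral frequencies
grid = np.linspace(-3.0, 3.0, 50)[:, None]
H = max_location_entropy(X, y, W, grid)
print(f"estimated entropy of the maximum distribution: {H:.3f}")
```

In a frequency-selection loop, a quantity such as `H` would be added, with some weight, to the log marginal likelihood being maximized over `W`. The exact form of that regularized objective is not given in the text above, so the combination is left out of the sketch.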
