Provably Efficient Bayesian Optimization with Unbiased Gaussian Process Hyperparameter Estimation

Gaussian process (GP) based Bayesian optimization (BO) is a powerful method for optimizing black-box functions efficiently. Both its practical performance and its theoretical guarantees depend on having the correct GP hyperparameter values, which are usually unknown in advance and must be estimated from the observed data. In practice, however, these estimates can be incorrect because the data sampling strategies commonly used in BO are biased. This can degrade performance and break the sub-linear global convergence guarantee of BO. To address this issue, we propose a new BO method that converges sub-linearly to the global optimum of the objective function even when the true GP hyperparameters are unknown in advance and must be estimated from the observed data. Our method uses a multi-armed bandit technique (EXP3) to add random data points to the BO process, and employs a novel training loss function for GP hyperparameter estimation that ensures unbiased estimation from the observed data. We further provide a theoretical analysis of the proposed method. Finally, we demonstrate empirically that our method outperforms existing approaches on various synthetic and real-world problems.
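
For readers unfamiliar with EXP3, the following is a minimal sketch of the standard textbook algorithm, not of the method proposed here. The abstract does not specify how the arms or rewards are defined, so the two-arm setup (arm 0 standing for "query the acquisition-function point", arm 1 for "query a uniformly random point"), the fixed toy rewards, and the names `Exp3` and `simulate` are all illustrative assumptions.

```python
import math
import random


class Exp3:
    """Textbook EXP3 for adversarial bandits with rewards in [0, 1]."""

    def __init__(self, n_arms, gamma, seed=None):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must lie in (0, 1]")
        self.n_arms = n_arms
        self.gamma = gamma                # exploration rate
        self.weights = [1.0] * n_arms     # one exponential weight per arm
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with the uniform one,
        # so every arm keeps probability at least gamma / n_arms.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, reward, prob):
        # Importance-weighted reward estimate: unbiased for every arm,
        # even though only the pulled arm is observed.
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale by the largest weight to avoid overflow; this leaves
        # the selection probabilities unchanged.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]


def simulate(n_rounds, seed=0):
    """Toy run. ASSUMPTION: arm 0 = acquisition point, arm 1 = random point."""
    toy_rewards = [0.8, 0.2]  # hypothetical fixed rewards, for illustration only
    bandit = Exp3(n_arms=2, gamma=0.1, seed=seed)
    for _ in range(n_rounds):
        arm, prob = bandit.select()
        bandit.update(arm, toy_rewards[arm], prob)
    return bandit.probabilities()


if __name__ == "__main__":
    print(simulate(2000))
```

Because of the uniform mixing term, the lower-reward arm is still selected with probability at least `gamma / n_arms`, which is the property that lets a bandit of this kind keep injecting random points rather than abandoning them entirely.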
