A General Framework for User-Guided Bayesian Optimization

The optimization of expensive-to-evaluate black-box functions is prevalent across scientific disciplines. Bayesian optimization is an automatic, general, and sample-efficient method for solving these problems with minimal knowledge of the underlying function dynamics. However, Bayesian optimization's limited ability to incorporate prior knowledge or beliefs about the function at hand, and thereby accelerate the optimization, reduces its appeal to knowledgeable practitioners with tight budgets. To allow domain experts to customize the optimization routine, we propose ColaBO, the first Bayesian-principled framework for incorporating prior beliefs beyond the typical kernel structure, such as the likely location of the optimizer or the optimal value. The generality of ColaBO makes it applicable across different Monte Carlo acquisition functions and types of user beliefs. We empirically demonstrate ColaBO's ability to substantially accelerate optimization when the prior information is accurate, and to retain approximately default performance when it is misleading.
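
The sketch below is one plausible reading of the mechanism the abstract describes, not the authors' implementation: draw Monte Carlo sample paths from a Gaussian process posterior, reweight each path by a user prior over where the optimizer lies, and average a Monte Carlo acquisition function (here, expected improvement) under those weights. The toy 1-D setup, all function names, and the self-normalized importance-weighting scheme are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): reweighting Monte Carlo
# GP posterior samples by a user prior over the optimizer's location.
# The 1-D toy problem and all names here are assumptions for exposition.
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(a, b, ls=0.2):
    """Squared-exponential kernel on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Toy observed data: three noisy evaluations of an unknown function.
X_obs = np.array([0.1, 0.5, 0.9])
y_obs = np.array([0.3, -0.2, 0.5])
noise = 1e-4

# GP posterior over a dense grid of candidate points.
X = np.linspace(0.0, 1.0, 200)
K_oo = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
K_xo = rbf_kernel(X, X_obs)
L = np.linalg.cholesky(K_oo)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
mu = K_xo @ alpha
v = np.linalg.solve(L, K_xo.T)
cov = rbf_kernel(X, X) - v.T @ v + 1e-8 * np.eye(len(X))

# Draw Monte Carlo sample paths from the posterior.
n_paths = 256
paths = rng.multivariate_normal(mu, cov, size=n_paths)  # (n_paths, |X|)

# User belief pi(x*): the optimum is thought to lie near x = 0.2
# (a hypothetical belief, chosen only for this example).
def user_prior(x, loc=0.2, scale=0.1):
    return np.exp(-0.5 * ((x - loc) / scale) ** 2)

# Weight each path by the prior density at that path's minimizer, then
# self-normalize: paths whose optimum agrees with the belief count more.
argmins = X[np.argmin(paths, axis=1)]
w = user_prior(argmins)
w = w / w.sum()

# Belief-weighted Monte Carlo expected improvement (minimization).
best = y_obs.min()
improvement = np.maximum(best - paths, 0.0)   # (n_paths, |X|)
ei = (w[:, None] * improvement).sum(axis=0)   # weighted MC average
x_next = X[np.argmax(ei)]
print(f"next query: x = {x_next:.3f}")
```

Replacing the improvement term with another per-path quantity would apply the same belief weighting to other Monte Carlo acquisition functions, which is consistent with the generality the abstract claims; beliefs over the optimal value could likewise weight paths by their sampled minimum rather than its location.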
