Multi-fidelity Gaussian Process Bandit Optimisation

In many scientific and engineering applications, we are tasked with maximising an expensive-to-evaluate black-box function f. Traditional settings for this problem assume that only this single function is available. In many cases, however, cheap approximations to f can be obtained. For example, the expensive real-world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate regions of low function value cheaply, reserve the expensive evaluations of f for a small but promising region, and thereby identify the optimum quickly. We formalise this task as a multi-fidelity bandit problem in which the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. Our theoretical analysis demonstrates that it exhibits precisely this behaviour and achieves better regret bounds than strategies that ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
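To make the idea concrete, the following is a minimal two-fidelity sketch of an upper-confidence-bound loop in this spirit. It is an illustration under stated assumptions, not the paper's implementation: the one-dimensional grid, the squared-exponential kernel, the constant `beta`, the bias bounds `zeta`, the switching thresholds `gamma`, and the toy functions `f_low` and `f_high` are all hypothetical choices made for the example. The sketch takes the minimum of the per-fidelity upper bounds as the acquisition, and queries the cheap fidelity only while its confidence width at the chosen point is still large.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-3):
    """Posterior mean and std of a zero-mean, unit-variance GP on x_grid."""
    if len(x_obs) == 0:
        return np.zeros_like(x_grid), np.ones_like(x_grid)
    x_obs, y_obs = np.asarray(x_obs), np.asarray(y_obs)
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf_kernel(x_grid, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def mf_ucb_sketch(fidelities, costs, zeta, gamma, budget, beta=4.0, grid_size=200):
    """Illustrative multi-fidelity UCB loop (all parameters are assumptions).

    fidelities: callables ordered cheapest to most expensive (last is the target f).
    costs:      positive cost per query at each fidelity.
    zeta:       assumed bound on |f_m - f| for each fidelity (0 for the target).
    gamma:      width thresholds for every fidelity except the last.
    """
    x_grid = np.linspace(0.0, 1.0, grid_size)
    M = len(fidelities)
    data = [([], []) for _ in range(M)]
    spent, history = 0.0, []
    while True:
        ucbs, widths = [], []
        for m in range(M):
            mean, std = gp_posterior(data[m][0], data[m][1], x_grid)
            width = np.sqrt(beta) * std
            ucbs.append(mean + width + zeta[m])  # upper bound on f via fidelity m
            widths.append(width)
        i = int(np.argmax(np.min(ucbs, axis=0)))  # tightest bound, then maximise
        # Use the cheapest fidelity that is still uncertain at the chosen point.
        m = M - 1
        for k in range(M - 1):
            if widths[k][i] >= gamma[k]:
                m = k
                break
        if spent + costs[m] > budget:
            break
        y = fidelities[m](x_grid[i])
        data[m][0].append(x_grid[i])
        data[m][1].append(y)
        spent += costs[m]
        history.append((m, float(x_grid[i]), float(y)))
    return history

# Hypothetical toy problem: f_low is within 0.1 of the target f_high everywhere.
def f_high(x):
    return float(x * np.sin(3.0 * x))

def f_low(x):
    return f_high(x) + 0.1 * float(np.cos(10.0 * x))

costs = [1.0, 5.0]
history = mf_ucb_sketch([f_low, f_high], costs, zeta=[0.1, 0.0],
                        gamma=[0.3], budget=30.0)
n_low = sum(1 for m, _, _ in history if m == 0)
print(f"{n_low} cheap queries, {len(history) - n_low} expensive queries")
```

In this sketch the cheap queries shrink the upper bound over unpromising regions, so any expensive queries are made only at points where the cheap model is already confident. How `zeta` and `gamma` should be set in practice is not specified here and would need to follow the paper.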
