Optimal Order Simple Regret for Gaussian Process Bandits

Consider the sequential optimization of a continuous, possibly non-convex, and expensive-to-evaluate objective function $f$. The problem can be cast as a Gaussian process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). The state-of-the-art analysis of several learning algorithms shows a significant gap between the lower and upper bounds on simple regret. With $N$ the number of exploration trials and $\gamma_N$ the maximal information gain, we prove an $\tilde{\mathcal{O}}(\sqrt{\gamma_N/N})$ bound on the simple regret of a pure exploration algorithm that is significantly tighter than the existing bounds. We show that this bound is order optimal, up to logarithmic factors, in the cases where a lower bound on regret is known. To establish these results, we prove novel and sharp confidence intervals for GP models that are applicable to RKHS elements and may be of broader interest.
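To make the pure exploration procedure concrete, here is a minimal sketch of one natural instantiation: query the point of maximal posterior variance for $N$ rounds, then report the maximiser of the final posterior mean. This is an illustrative assumption, not a verbatim transcription of the paper's algorithm; the squared-exponential kernel, the finite candidate grid standing in for the continuous domain, and the helper names (`rbf_kernel`, `gp_posterior`, `pure_exploration_gp`) are all our choices.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    return np.exp(-((A[:, None] - B[None, :]) ** 2) / (2.0 * lengthscale ** 2))

def gp_posterior(X_obs, y_obs, X_cand, noise_var=0.01):
    """GP posterior mean and variance at the candidate points."""
    K = rbf_kernel(X_obs, X_obs) + noise_var * np.eye(len(X_obs))
    k_star = rbf_kernel(X_obs, X_cand)        # shape (n_obs, n_cand)
    K_inv = np.linalg.inv(K)                  # fine for a small sketch
    mean = k_star.T @ K_inv @ y_obs
    # var_j = k(x_j, x_j) - k_*(x_j)^T K^{-1} k_*(x_j), with k(x, x) = 1
    var = 1.0 - np.einsum("ij,ik,kj->j", k_star, K_inv, k_star)
    return mean, np.maximum(var, 0.0)         # clip tiny negative numerical noise

def pure_exploration_gp(f, X_cand, N, noise_std=0.1, seed=0):
    """Query the most uncertain candidate for N rounds (pure exploration),
    then report the maximiser of the final posterior mean."""
    rng = np.random.default_rng(seed)
    X_obs, y_obs = [], []
    for _ in range(N):
        if not X_obs:
            idx = rng.integers(len(X_cand))   # no data yet: random start
        else:
            _, var = gp_posterior(np.array(X_obs), np.array(y_obs), X_cand)
            idx = int(np.argmax(var))         # maximal posterior variance
        X_obs.append(X_cand[idx])
        y_obs.append(f(X_cand[idx]) + noise_std * rng.standard_normal())
    mean, _ = gp_posterior(np.array(X_obs), np.array(y_obs), X_cand)
    return X_cand[int(np.argmax(mean))]

# Example: a smooth 1-D objective on [0, 1], discretised to 200 candidates.
f = lambda x: np.exp(-((x - 0.3) ** 2) / 0.05) * np.sin(8.0 * x)
X = np.linspace(0.0, 1.0, 200)
x_hat = pure_exploration_gp(f, X, N=50)
print(f"reported point: {x_hat:.3f}, "
      f"empirical simple regret: {f(X).max() - f(x_hat):.4f}")
```

Note that the exploration rule never consults the posterior mean, so the $N$ queries are pure exploration; the mean is used only once, to pick the reported point $\hat{x}$ whose simple regret $f(x^\star) - f(\hat{x})$ the $\tilde{\mathcal{O}}(\sqrt{\gamma_N/N})$ bound controls.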
