A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

We present a tutorial on Bayesian optimization, a method for finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to obtain a posterior distribution over the function. This posterior permits a utility-based selection of the next observation to make on the objective function, one that must trade off exploration (sampling from areas of high uncertainty) against exploitation (sampling areas likely to improve on the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modeling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences.
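
As a concrete illustration of the loop described in the abstract, the sketch below fits a Gaussian process to the observations gathered so far and picks the next query point by maximizing the expected-improvement acquisition function. This is a minimal example under stated assumptions, not the tutorial's implementation: the one-dimensional objective, the fixed squared-exponential kernel hyperparameters, and the grid-based acquisition maximization are all illustrative choices.

```python
# Minimal Bayesian optimization sketch (illustrative only): GP prior with a
# fixed squared-exponential kernel, expected-improvement acquisition, and
# grid-based acquisition maximization. The objective and hyperparameters are
# assumptions made for this example, not the paper's experimental setup.
import numpy as np
from scipy.stats import norm

def objective(x):
    # Hypothetical "expensive" black-box function (1-D, for illustration).
    return -np.sin(3 * x) - x**2 + 0.7 * x

def sq_exp_kernel(a, b, length_scale=0.3, signal_var=1.0):
    # Squared-exponential (RBF) covariance between two sets of 1-D points.
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise_var=1e-6):
    # Standard GP regression: posterior mean and variance at the query points.
    K = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = sq_exp_kernel(x_obs, x_query)
    K_ss = sq_exp_kernel(x_query, x_query)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y_obs
    var = np.diag(K_ss - K_s.T @ K_inv @ K_s)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, y_best):
    # EI(x) = (mu - y_best) * Phi(z) + sigma * phi(z), with z = (mu - y_best) / sigma.
    sigma = np.sqrt(var)
    z = (mu - y_best) / sigma
    return (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
x_grid = np.linspace(-1.0, 2.0, 500)      # candidate set for acquisition maximization
x_obs = rng.uniform(-1.0, 2.0, size=3)    # a few initial observations
y_obs = objective(x_obs)

for _ in range(10):                       # sequential observation budget
    mu, var = gp_posterior(x_obs, y_obs, x_grid)
    ei = expected_improvement(mu, var, y_obs.max())
    x_next = x_grid[np.argmax(ei)]        # exploration/exploitation trade-off via EI
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

print(f"best x = {x_obs[y_obs.argmax()]:.3f}, best y = {y_obs.max():.3f}")
```

In practice the kernel hyperparameters would be learned from the data and the acquisition function maximized with a proper global optimizer rather than a grid, but the structure of the loop (fit the posterior, maximize the acquisition, observe, repeat) is the same.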
