A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions

This tutorial introduces Gaussian process regression as an expressive tool for modelling, actively exploring, and exploiting unknown functions. Gaussian process regression is a powerful, non-parametric Bayesian approach to regression problems that can be used in exploration and exploitation scenarios. The tutorial aims to provide an accessible introduction to these techniques. We introduce Gaussian processes, which define distributions over functions and can be used for Bayesian non-parametric regression, and demonstrate their use in applications and didactic examples: simple regression problems, a demonstration of kernel-encoded prior assumptions and kernel compositions, a pure exploration scenario within an optimal design framework, and a bandit-like exploration-exploitation scenario in which the goal is to recommend movies. We also describe a risk-averse exploration setting in which an additional constraint (not to sample below a certain threshold) must be taken into account. Lastly, we summarize recent psychological experiments that use Gaussian processes. Software and literature pointers are also provided.
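To make the modelling and exploration ideas concrete, the following is a minimal sketch of Gaussian process regression in Python with NumPy. It is not taken from the tutorial. The squared-exponential kernel, the hyperparameter values, the toy sine data, and the names `rbf_kernel` and `gp_posterior` are all assumptions chosen for illustration. The last lines show two simple sampling rules: picking the most uncertain input (pure exploration) and picking the input with the highest upper confidence bound (one common way to trade off exploration and exploitation).

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed choice)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    # Cholesky factorisation avoids forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy data (illustrative only): a few noiseless observations of a sine function.
x_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.linspace(-3.0, 3.0, 61)

mean, var = gp_posterior(x_train, y_train, x_test)

# Pure exploration: sample where the posterior is most uncertain.
next_explore = x_test[np.argmax(var)]

# Exploration-exploitation: upper confidence bound with an assumed weight beta.
beta = 2.0
ucb = mean + beta * np.sqrt(np.maximum(var, 0.0))
next_ucb = x_test[np.argmax(ucb)]
```

The posterior variance shrinks near observed inputs and grows away from them, which is what drives both sampling rules. In practice the kernel hyperparameters would be estimated from data rather than fixed as they are here.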
