Beyond Intuition, a Framework for Applying GPs to Real-World Data

Gaussian Processes (GPs) offer an attractive method for regression over small, structured and correlated datasets. However, their deployment is hindered by computational costs and by limited guidance on how to apply GPs beyond simple low-dimensional datasets. We propose a framework for assessing whether GPs suit a given problem and for setting up a robust, well-specified GP model. The guidelines formalise the decisions of experienced GP practitioners, with an emphasis on kernel design and options for computational scalability. The framework is then applied to a case study of glacier elevation change, where it yields more accurate results at test time.
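To make the computational cost mentioned above concrete, the following is a minimal sketch of exact GP regression in NumPy. It is an illustration only, not the framework proposed here: the squared-exponential kernel, the hyperparameter values, the toy sine data and the function names `rbf_kernel` and `gp_posterior` are all assumptions chosen for the example. The Cholesky factorisation of the n-by-n training covariance is the step that scales cubically with the number of training points.

```python
import numpy as np


def rbf_kernel(xa, xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel choice)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-2, lengthscale=1.0, variance=1.0):
    """Exact GP posterior mean and covariance at x_test (zero prior mean assumed)."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale, variance)
    k_ts = rbf_kernel(x_train, x_test, lengthscale, variance)
    k_ss = rbf_kernel(x_test, x_test, lengthscale, variance)

    # Cholesky factor of the noisy training covariance: the O(n^3) step.
    chol = np.linalg.cholesky(k_tt + noise * np.eye(len(x_train)))

    # alpha = (K + noise * I)^{-1} y, computed via two solves with the factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)

    mean = k_ts.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov


# Hypothetical toy data: noiseless samples of a sine curve.
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_test = np.array([1.0, 2.5, 4.0])

mean, cov = gp_posterior(x_train, y_train, x_test)
print("posterior mean:", mean)
print("posterior std: ", np.sqrt(np.clip(np.diag(cov), 0.0, None)))
```

With 20 training points the factorisation is trivial, but the same code on tens of thousands of points becomes impractical in both time and memory, which is the situation the scalability options in the framework are meant to address.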
