Probabilistic selection of inducing points in sparse Gaussian processes

Sparse Gaussian processes and various extensions thereof are enabled through inducing points, that simultaneously bottleneck the predictive capacity and act as the main contributor towards model complexity. However, the number of inducing points is generally not associated with uncertainty which prevents us from applying the apparatus of Bayesian reasoning in identifying an appropriate trade-off. In this work we place a point process prior on the inducing points and approximate the associated posterior through stochastic variational inference. By letting the prior encourage a moderate number of inducing points, we enable the model to learn which and how many points to utilise. We experimentally show that fewer inducing points are preferred by the model as the points become less informative, and further demonstrate how the method can be applied in deep Gaussian processes and latent variable modelling.

[1]  Neil D. Lawrence,et al.  Bayesian Gaussian Process Latent Variable Model , 2010, AISTATS.

[2]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[3]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[4]  Neil D. Lawrence,et al.  Nested Variational Compression in Deep Gaussian Processes , 2014, 1412.1370.

[5]  Theodoros Damoulas,et al.  Multi-resolution Multi-task Gaussian Processes , 2019, NeurIPS.

[6]  Carl E. Rasmussen,et al.  Rates of Convergence for Sparse Variational Gaussian Process Regression , 2019, ICML.

[7]  Jos'e Miguel Hern'andez-Lobato,et al.  Deep Gaussian Processes with Decoupled Inducing Inputs , 2018 .

[8]  Sébastien Ourselin,et al.  Efficient Gaussian Process-Based Modelling and Prediction of Image Time Series , 2015, IPMI.

[9]  Marc Peter Deisenroth,et al.  Doubly Stochastic Variational Inference for Deep Gaussian Processes , 2017, NIPS.

[10]  M. Opper Sparse Online Gaussian Processes , 2008 .

[11]  Paul Glasserman,et al.  Monte Carlo Methods in Financial Engineering , 2003 .

[12]  Neil D. Lawrence,et al.  Fast Sparse Gaussian Process Methods: The Informative Vector Machine , 2002, NIPS.

[13]  James Hensman,et al.  On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes , 2015, AISTATS.

[14]  Richard E. Turner,et al.  Streaming Sparse Gaussian Process Approximations , 2017, NIPS.

[15]  James Hensman,et al.  MCMC for Variationally Sparse Gaussian Processes , 2015, NIPS.

[16]  Byron Boots,et al.  Orthogonally Decoupled Variational Gaussian Processes , 2018, NeurIPS.

[17]  Christopher K. I. Williams,et al.  Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .

[18]  Neil D. Lawrence,et al.  Gaussian Processes for Big Data , 2013, UAI.

[19]  James Hensman,et al.  Scalable Variational Gaussian Process Classification , 2014, AISTATS.

[20]  Arno Solin,et al.  Variational Fourier Features for Gaussian Processes , 2016, J. Mach. Learn. Res..

[21]  Carl E. Rasmussen,et al.  A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..

[22]  Neil D. Lawrence,et al.  Deep Gaussian Processes , 2012, AISTATS.

[23]  Ben Taskar,et al.  Determinantal Point Processes for Machine Learning , 2012, Found. Trends Mach. Learn..

[24]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[25]  Carl E. Rasmussen,et al.  Sparse Spectrum Gaussian Process Regression , 2010, J. Mach. Learn. Res..

[26]  Carl E. Rasmussen,et al.  PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.

[27]  Lehel Csató,et al.  Sparse On-Line Gaussian Processes , 2002, Neural Computation.

[28]  Jonas Mockus,et al.  On Bayesian Methods for Seeking the Extremum , 1974, Optimization Techniques.

[29]  Samuel Kaski,et al.  Deep convolutional Gaussian processes , 2018, ECML/PKDD.

[30]  Alexis Boukouvalas,et al.  GrandPrix: scaling up the Bayesian GPLVM for single-cell data , 2017, bioRxiv.

[31]  Ryan P. Adams,et al.  Avoiding pathologies in very deep networks , 2014, AISTATS.

[32]  Neil D. Lawrence,et al.  Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data , 2003, NIPS.

[33]  Nando de Freitas,et al.  Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.

[34]  Andrew Gordon Wilson,et al.  GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration , 2018, NeurIPS.

[35]  Roy L. Streit,et al.  Poisson Point Processes: Imaging, Tracking, and Sensing , 2010 .

[36]  Michalis K. Titsias,et al.  Variational Learning of Inducing Variables in Sparse Gaussian Processes , 2009, AISTATS.

[37]  Ariel D. Procaccia,et al.  Variational Dropout and the Local Reparameterization Trick , 2015, NIPS.

[38]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[39]  Zoubin Ghahramani,et al.  Sparse Gaussian Processes using Pseudo-inputs , 2005, NIPS.

[40]  Maurizio Filippone,et al.  Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations , 2020, AISTATS.

[41]  Aníbal R. Figueiras-Vidal,et al.  Inter-domain Gaussian Processes for Sparse Inference using Inducing Features , 2009, NIPS.