Doubly Sparse Variational Gaussian Processes

The use of Gaussian process models is typically limited to datasets with at most a few tens of thousands of observations because of their cubic computational complexity and quadratic memory footprint. The two most common approaches to overcoming this limitation are 1) sparse variational approximations, which rely on a small set of inducing points, and 2) the equivalent state-space formulation of Gaussian processes, which can be seen as exploiting sparsity in the precision matrix. We propose to take the best of both worlds: we show that the inducing-point framework remains valid for state-space models and that it can bring further computational and memory savings. Furthermore, we provide the natural gradient formulation for the proposed variational parameterisation. Finally, this work makes it possible to use the state-space formulation inside deep Gaussian process models, as we illustrate in one of the experiments.
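To make the two ingredients concrete, here is a hedged sketch (not the paper's exact derivation) of the standard sparse variational bound with inducing variables \(u = f(Z)\) and Gaussian variational posterior \(q(u) = \mathcal{N}(m, S)\):

\[
\mathcal{L} = \sum_{n=1}^{N} \mathbb{E}_{q(f(x_n))}\big[\log p(y_n \mid f(x_n))\big] - \mathrm{KL}\big[q(u)\,\|\,p(u)\big],
\qquad
q(f) = \int p(f \mid u)\, q(u)\, \mathrm{d}u .
\]

For a Markovian (state-space) prior and ordered inducing inputs \(z_1 < \dots < z_M\), the prior over the inducing states factorises as

\[
p(u) = p(u_1) \prod_{k=2}^{M} p(u_k \mid u_{k-1}),
\]

so its precision matrix is block tridiagonal; if \(q(u)\) is parameterised with the same banded precision structure, the terms of the bound can be evaluated in time linear in \(M\) rather than cubic. For the natural gradient step, one can rely on the standard exponential-family identity that the natural gradient with respect to the natural parameters \(\theta\) equals the ordinary gradient with respect to the expectation parameters \(\mu\), i.e. \(\tilde{\nabla}_{\theta} \mathcal{L} = \nabla_{\mu} \mathcal{L}\); how this interacts with the banded parameterisation is specific to the paper and not reproduced here.

The snippet below is a minimal numerical illustration, assuming nothing beyond NumPy, of the "sparsity in the precision matrix" claim: for the Markovian Matérn-1/2 (Ornstein-Uhlenbeck) kernel, the precision of the marginal at ordered inputs is tridiagonal. The code assembles that precision from the AR(1) Markov factorisation and checks that it inverts back to the dense kernel matrix; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, size=8))   # ordered (inducing) time points
sigma2, ell = 1.5, 0.7                        # kernel variance and lengthscale

# Dense Matern-1/2 kernel matrix: K[i, j] = sigma2 * exp(-|t_i - t_j| / ell)
K = sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / ell)

# Markov factorisation of the same prior:
#   u_1 ~ N(0, sigma2),  u_k | u_{k-1} ~ N(a_{k-1} u_{k-1}, sigma2 * (1 - a_{k-1}^2))
a = np.exp(-np.diff(t) / ell)                            # AR(1) transition coefficients
d = np.concatenate(([sigma2], sigma2 * (1.0 - a ** 2)))  # innovation variances

A = np.eye(len(t))
A[np.arange(1, len(t)), np.arange(len(t) - 1)] = -a      # unit lower-bidiagonal map u -> innovations
Lambda = A.T @ np.diag(1.0 / d) @ A                      # tridiagonal precision matrix

# The sparse precision inverts back to the dense covariance ...
assert np.allclose(np.linalg.inv(Lambda), K)
# ... and everything beyond the first off-diagonal is zero.
assert np.allclose(np.triu(Lambda, k=2), 0.0)
```

A Cholesky factorisation of such a banded matrix costs \(O(M)\) for a fixed state dimension, which is where the computational and memory savings over the dense \(O(M^3)\) treatment come from.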
