Scalable Training of Inference Networks for Gaussian-Process Models

Inference in Gaussian process (GP) models is computationally challenging for large datasets and is often difficult to approximate with a small number of inducing points. We explore an alternative approximation that employs stochastic inference networks for flexible inference. Unfortunately, with minibatch training it is difficult for such networks to learn meaningful correlations over function outputs for a large dataset. We propose an algorithm that enables such training by tracking a stochastic, functional mirror-descent algorithm. Each iteration requires considering only a finite number of input locations, resulting in a scalable and easy-to-implement algorithm. Empirical results show performance comparable, and sometimes superior, to existing sparse variational GP methods.
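To make the computational challenge concrete, the following is a minimal sketch of exact GP regression, the baseline that sparse and inference-network approximations aim to avoid. It is not the algorithm proposed here. The RBF kernel, the hyperparameter values, the toy data, and the names `rbf_kernel` and `exact_gp_predict` are all illustrative assumptions. The Cholesky factorization of the N x N kernel matrix costs O(N^3) time and O(N^2) memory, which is what becomes prohibitive for large N.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def exact_gp_predict(x_train, y_train, x_test, noise_var=0.1):
    """Exact GP regression posterior at x_test under a Gaussian likelihood."""
    n = x_train.shape[0]
    k_nn = rbf_kernel(x_train, x_train) + noise_var * np.eye(n)
    # This factorization is the O(N^3) bottleneck for large datasets.
    chol = np.linalg.cholesky(k_nn)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    k_sn = rbf_kernel(x_test, x_train)
    mean = k_sn @ alpha
    v = np.linalg.solve(chol, k_sn.T)
    cov = rbf_kernel(x_test, x_test) - v.T @ v
    return mean, cov


# Toy data (hypothetical): noisy observations of sin(x).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(50, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(50)
x_test = np.linspace(-3, 3, 5)[:, None]
mean, cov = exact_gp_predict(x_train, y_train, x_test)
```

The returned `cov` is a full covariance over the test locations. These are the correlations over function outputs that, per the abstract, are hard to learn when an inference network is trained on minibatches.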
