Sequential Inference for Deep Gaussian Process

A deep Gaussian process (DGP) is a deep network in which each layer is modelled with a Gaussian process (GP). It is a flexible model that can capture highly nonlinear functions in complex data sets. However, the network structure of a DGP often makes inference computationally expensive. In this paper, we propose an efficient sequential inference framework for DGPs, in which the data are processed sequentially. We also propose two DGP extensions to handle heteroscedasticity and multi-task learning. Our experimental evaluation shows the effectiveness of our sequential inference framework on a number of important learning tasks.
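To make the layered construction concrete, the following is a minimal sketch (not the paper's inference method) of drawing a sample from a two-layer DGP prior: a function is sampled from a GP at the inputs, and its outputs are fed as inputs to a second GP layer. The RBF kernel, lengthscales, and jitter value are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) covariance between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_gp_layer(x, rng, lengthscale=1.0):
    # Draw one function sample f ~ GP(0, k) evaluated at the inputs x,
    # via a Cholesky factor of the (jittered) covariance matrix.
    K = rbf_kernel(x, x, lengthscale) + 1e-8 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal(len(x))

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
h = sample_gp_layer(x, rng)   # hidden layer: f1(x)
y = sample_gp_layer(h, rng)   # output layer: f2(f1(x)), a DGP prior draw
```

Composing GP layers this way yields highly nonlinear, non-stationary sample functions even though each individual layer is a standard GP, which is what makes exact inference in the composite model intractable.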
