Deep Gaussian Processes with Importance-Weighted Variational Inference

Deep Gaussian processes (DGPs) can model complex marginal densities as well as complex mappings. Non-Gaussian marginals are essential for modelling real-world data, and can be generated from the DGP by incorporating uncorrelated variables into the model. Previous work on DGP models has introduced noise additively and used variational inference with a combination of sparse Gaussian processes and mean-field Gaussians for the approximate posterior. Additive noise attenuates the signal, and the Gaussian form of the variational distribution may lead to an inaccurate posterior. We instead incorporate noisy variables as latent covariates, and propose a novel importance-weighted objective that leverages analytic results and provides a mechanism to trade off computation for improved accuracy. Our results demonstrate that the importance-weighted objective works well in practice and consistently outperforms classical variational inference, especially for deeper models.
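For concreteness, a minimal sketch of the kind of objective meant here, assuming the standard K-sample importance-weighted bound (Burda et al., 2016) with a latent covariate w per data point and a variational distribution q(w); the exact objective in the paper additionally exploits analytic results for the sparse Gaussian process layers, which are not shown:

\mathcal{L}_K \;=\; \mathbb{E}_{w_1, \ldots, w_K \sim q(w)} \left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(\mathbf{y} \mid w_k)\, p(w_k)}{q(w_k)} \right]

For K = 1 this reduces to the standard evidence lower bound (ELBO); as K grows, the bound tightens monotonically toward log p(y) at the cost of K evaluations per gradient step, which is the computation-for-accuracy trade-off described above.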
