Gaussian Process Conditional Density Estimation

Conditional density estimation (CDE) models estimate conditional distributions, where the conditions imposed on the distribution are the inputs of the model. CDE is challenging because of a fundamental trade-off between model complexity, representational capacity and overfitting. In this work, we propose to extend the model's input with latent variables and to use Gaussian processes (GPs) to map this augmented input onto samples from the conditional distribution. Our Bayesian approach allows small datasets to be modeled, and we also provide the machinery to apply it to big data using stochastic variational inference. Our approach can model densities even in sparse data regions and allows learned structure to be shared between conditions. We illustrate the effectiveness and wide-reaching applicability of our model on a variety of real-world problems, such as spatio-temporal density estimation of taxi drop-offs, non-Gaussian noise modeling, and few-shot learning on Omniglot images.
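To make the idea of an augmented input concrete, the following is a minimal sketch of the generative view only, not the inference procedure described in the paper. It assumes a standard normal prior on the latent variable, a squared-exponential kernel, and Gaussian observation noise; the names `rbf_kernel` and `sample_conditional_prior` and all default hyperparameters are hypothetical choices made for illustration. For a fixed condition x, each draw of the latent variable w gives a different augmented input [x, w], so a single GP function sample produces a spread of outputs at that x.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def sample_conditional_prior(x, num_samples, latent_dim=1, noise_std=0.05, seed=0):
    """Draw outputs at one condition x from a GP prior over augmented inputs.

    Assumed model (illustrative): w ~ N(0, I), f ~ GP(0, k), y = f([x, w]) + noise.
    """
    rng = np.random.default_rng(seed)
    x = np.atleast_2d(x)                                   # (1, input_dim)
    w = rng.standard_normal((num_samples, latent_dim))     # latent variables
    x_aug = np.concatenate([np.repeat(x, num_samples, axis=0), w], axis=1)

    # Joint GP prior over all augmented inputs; jitter keeps Cholesky stable.
    cov = rbf_kernel(x_aug, x_aug) + 1e-6 * np.eye(num_samples)
    chol = np.linalg.cholesky(cov)
    f = chol @ rng.standard_normal(num_samples)

    y = f + noise_std * rng.standard_normal(num_samples)
    return w, y


if __name__ == "__main__":
    w, y = sample_conditional_prior(np.array([0.3]), num_samples=50)
    print("latent draws:", w.shape, "output draws:", y.shape)
```

Because the outputs depend on w through the GP, their distribution at a given x need not be Gaussian, which is the property the abstract relies on. Fitting such a model to data requires approximate inference over both the GP and the latent variables, which this sketch does not attempt.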
