Scalable Inference for Gaussian Process Models with Black-Box Likelihoods

We propose a sparse method for scalable automated variational inference (AVI) in a large class of models with Gaussian process (GP) priors, multiple latent functions, multiple outputs and non-linear likelihoods. Our approach retains the statistical efficiency of the original AVI method, requiring only expectations over univariate Gaussian distributions to approximate the posterior with a mixture of Gaussians. Experiments on small datasets spanning regression, classification, Log Gaussian Cox processes, and warped GPs show that our method can perform as well as the full method at high sparsity levels. In larger experiments on the MNIST and SARCOS datasets, our method outperforms previously published scalable approaches that were handcrafted for specific likelihood models.
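The statistical-efficiency claim above rests on the fact that, with Gaussian (or mixture-of-Gaussians) variational marginals over the latent function values, the expected log-likelihood term of the variational objective decomposes into one-dimensional Gaussian expectations, so the likelihood only ever needs to be evaluated point-wise as a black box. The sketch below illustrates this primitive with a simple Monte Carlo estimator; the function names and the Bernoulli example are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed, not the paper's code): estimating the expected
# log-likelihood term E_{f ~ N(mean, var)}[log p(y | f)] using only samples
# from a univariate Gaussian, with the likelihood treated as a black box.
import numpy as np

def expected_loglik(y, mean, var, black_box_loglik, n_samples=100, rng=None):
    """Monte Carlo estimate of E_{f ~ N(mean, var)}[log p(y | f)]."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample latent function values from the univariate Gaussian marginal.
    f_samples = mean + np.sqrt(var) * rng.standard_normal(n_samples)
    # The likelihood is only evaluated point-wise -- no conjugacy required.
    return np.mean([black_box_loglik(y, f) for f in f_samples])

# Example: a Bernoulli (classification) likelihood treated as a black box.
def bernoulli_loglik(y, f):
    p = 1.0 / (1.0 + np.exp(-f))  # logistic link
    return y * np.log(p) + (1 - y) * np.log1p(-p)

print(expected_loglik(y=1, mean=0.5, var=0.2, black_box_loglik=bernoulli_loglik))
```

Because each such expectation is univariate, the same estimator applies unchanged to any likelihood that can be evaluated point-wise, which is what makes the inference scheme "automated".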
