Scalable Gaussian Process Classification via Expectation Propagation

Variational methods have recently been considered for scaling the training of Gaussian process classifiers to large datasets. As an alternative, we describe here how to train these classifiers efficiently using expectation propagation (EP). The proposed method can handle datasets with millions of data instances. More precisely, it supports (i) training in a distributed fashion, in which the data instances are sent to different nodes where the required computations are carried out, and (ii) maximizing an estimate of the marginal likelihood using a stochastic approximation of its gradient. Several experiments indicate that the method described is competitive with the variational approach.
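The abstract does not spell out the algorithm, but the building block being scaled is standard EP for binary GP classification with a probit likelihood. Below is a minimal, self-contained sketch of that building block in Python; the RBF kernel, the function names, and the dense O(n^3) posterior refresh are illustrative assumptions, not the authors' implementation. Because each EP site factor is tied to a single data instance, the site updates are local: they can be computed on different nodes or refreshed from minibatches, which is the property the distributed and stochastic variants described above exploit.

```python
# Minimal sketch of (parallel-update) EP for binary GP classification
# with a probit likelihood. Illustrative only: kernel choice, names,
# and the dense posterior refresh are assumptions, not the paper's code.
import numpy as np
from scipy.stats import norm

def rbf_kernel(X1, X2, ell=1.0, sf2=1.0):
    """Squared-exponential covariance (assumed kernel for this sketch)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def ep_gpc(X, y, n_sweeps=20, jitter=1e-8):
    """EP for GP classification; labels y in {-1, +1}.

    Returns the approximate posterior mean and covariance over the
    latent function values at the training inputs.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X) + jitter * np.eye(n)
    nu_t, tau_t = np.zeros(n), np.zeros(n)   # site natural parameters
    mu, Sigma = np.zeros(n), K.copy()        # current posterior (starts at prior)
    for _ in range(n_sweeps):
        for i in range(n):
            # Cavity: remove site i from the current posterior marginal.
            tau_cav = 1.0 / Sigma[i, i] - tau_t[i]
            nu_cav = mu[i] / Sigma[i, i] - nu_t[i]
            m_cav, v_cav = nu_cav / tau_cav, 1.0 / tau_cav
            # Moments of cavity * probit likelihood (tilted distribution).
            z = y[i] * m_cav / np.sqrt(1.0 + v_cav)
            ratio = norm.pdf(z) / norm.cdf(z)
            m_hat = m_cav + y[i] * v_cav * ratio / np.sqrt(1.0 + v_cav)
            v_hat = v_cav - v_cav**2 * ratio * (z + ratio) / (1.0 + v_cav)
            # New site so that cavity * site matches those moments.
            # (In practice the update is usually damped for robustness.)
            tau_t[i] = 1.0 / v_hat - tau_cav
            nu_t[i] = m_hat / v_hat - nu_cav
        # Refresh the joint posterior from the sites. This step is O(n^3);
        # avoiding this cost (sparse approximations, distributed/stochastic
        # site updates) is precisely what scalable EP methods target.
        Sigma = np.linalg.inv(np.linalg.inv(K) + np.diag(tau_t))
        mu = Sigma @ nu_t
    return mu, Sigma
```

Note the design point the sketch makes visible: the inner loop touches only one instance at a time, so a node holding a shard of the data can update its own sites independently, and a stochastic scheme can refresh a random minibatch of sites per step rather than sweeping all n.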
