Fast Dual Variational Inference for Non-Conjugate Latent Gaussian Models

Latent Gaussian models (LGMs) are widely used in statistics and machine learning. Bayesian inference in non-conjugate LGMs is difficult because it requires intractable integrals involving the Gaussian prior and non-conjugate likelihoods. Algorithms based on variational Gaussian (VG) approximations are widely employed since they strike a favorable balance between accuracy, generality, speed, and ease of use. However, the structure of the optimization problems associated with these approximations remains poorly understood, and standard solvers take too long to converge. We derive a novel dual variational inference approach that exploits the convexity of the VG approximations. The resulting algorithm solves a convex optimization problem, reduces the number of variational parameters, and converges much faster than previous methods. Using real-world data, we demonstrate these advantages on a variety of LGMs, including Gaussian process classification and latent Gaussian Markov random fields.
