Concave Gaussian Variational Approximations for Inference in Large-Scale Bayesian Linear Models

Two popular approaches to forming bounds in approximate Bayesian inference are local variational methods and minimal Kullback-Leibler divergence methods. For a large class of models we explicitly relate the two approaches, showing that the local variational method is equivalent to a weakened form of Kullback-Leibler Gaussian approximation. This gives a strong motivation to develop efficient methods for KL minimisation. An important and previously unproven property of the KL variational Gaussian bound is that, for log-concave sites, it is a concave function of the parameters of the Gaussian. This observation, along with compact concave parametrisations of the covariance, enables us to develop fast, scalable optimisation procedures to obtain lower bounds on the marginal likelihood in large-scale Bayesian linear models.
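
The concavity property can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes a standard normal prior on the weights, logistic sites (which are log-concave), a Gaussian q(w) = N(m, C C^T) with C a lower-triangular Cholesky factor with positive diagonal, and Gauss-Hermite quadrature for the one-dimensional site expectations. The names `vg_bound`, `random_params`, `H` and `y` are hypothetical and chosen only for this example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def vg_bound(m, C, H, y, deg=30):
    """KL variational Gaussian lower bound on log Z for q(w) = N(m, C C^T).

    Assumed model (illustrative only): prior N(w | 0, I) and logistic
    sites sigmoid(y_n * h_n^T w), where h_n is the n-th row of H.
    """
    D = m.shape[0]
    # Entropy of q; log|C| is the sum of log-diagonals for triangular C.
    entropy = 0.5 * D * np.log(2 * np.pi * np.e) + np.sum(np.log(np.diag(C)))
    # E_q[log N(w | 0, I)], using tr(C C^T) = sum of squared entries of C.
    prior = -0.5 * D * np.log(2 * np.pi) - 0.5 * (m @ m + np.sum(C ** 2))
    # Each site depends on w only through h_n^T w ~ N(mu_n, s_n^2).
    mu = H @ m
    s = np.linalg.norm(H @ C, axis=1)
    # Gauss-Hermite nodes/weights for the weight exp(-z^2 / 2),
    # rescaled so that the weights integrate against N(z | 0, 1).
    z, wq = hermegauss(deg)
    wq = wq / np.sqrt(2 * np.pi)
    a = y[:, None] * (mu[:, None] + s[:, None] * z[None, :])
    log_sigmoid = -np.logaddexp(0.0, -a)  # stable log(sigmoid(a))
    sites = np.sum(log_sigmoid @ wq)
    return entropy + prior + sites


rng = np.random.default_rng(0)
N, D = 20, 3
H = rng.normal(size=(N, D))
y = np.where(rng.random(N) < 0.5, -1.0, 1.0)


def random_params():
    """Draw a random mean and a lower-triangular factor with positive diagonal."""
    m = rng.normal(size=D)
    C = np.tril(rng.normal(size=(D, D)))
    C[np.diag_indices(D)] = np.abs(np.diag(C)) + 0.1
    return m, C


# Midpoint check: a concave function satisfies f((a + b) / 2) >= (f(a) + f(b)) / 2.
m1, C1 = random_params()
m2, C2 = random_params()
mid = vg_bound(0.5 * (m1 + m2), 0.5 * (C1 + C2), H, y)
avg = 0.5 * (vg_bound(m1, C1, H, y) + vg_bound(m2, C2, H, y))
print("midpoint value:", mid, "average of endpoints:", avg)
```

The midpoint check is only a spot test at one pair of points, not a proof. The Cholesky factor is used here because it is one of the covariance parametrisations under which the bound is concave in the parameters; a parametrisation in terms of the covariance matrix itself would not be expected to pass the same check in general.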
