Paper information - Gaussian Kullback-Leibler approximate inference

Gaussian Kullback-Leibler approximate inference

We investigate Gaussian Kullback-Leibler (G-KL) variational approximate inference techniques for Bayesian generalised linear models and various extensions. In particular, we make the following novel contributions: we describe sufficient conditions under which the G-KL objective is differentiable and convex; we provide constrained parameterisations of the Gaussian covariance that make G-KL methods fast and scalable; we prove that the lower bound on the normalisation constant provided by G-KL methods dominates the bounds provided by local lower bounding methods; and we discuss the complexity and model applicability of G-KL relative to other Gaussian approximate inference methods. We present numerical results comparing G-KL with other deterministic Gaussian approximate inference methods for robust Gaussian process regression models with either Student-t or Laplace likelihoods, large scale Bayesian binary logistic regression models, and Bayesian sparse linear models for sequential experimental design.
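To make the lower bound on the normalisation constant concrete, the following is a minimal sketch of how such a bound could be evaluated for a small Bayesian logistic regression model. It is not the authors' implementation. The function name `gkl_bound`, the isotropic Gaussian prior, the logistic likelihood sites, the Cholesky factor `C` as the covariance parameterisation, and the use of Gauss-Hermite quadrature are all assumptions made for illustration. The bound itself is the standard variational one: the entropy of q plus the expected log prior plus the sum of expected log site potentials.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gkl_bound(m, C, X, y, prior_var=1.0, quad_points=50):
    """Lower bound on log Z for Bayesian logistic regression (illustrative).

    q(w) = N(m, C C^T), with C lower triangular and a positive diagonal.
    Assumed model: prior N(0, prior_var * I), sites sigmoid(y_n * x_n^T w)
    with labels y_n in {-1, +1}.
    """
    m = np.asarray(m, dtype=float)
    C = np.asarray(C, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = m.size

    # Entropy of q: 0.5 * log det(2*pi*e*S), where log det S = 2 * sum(log C_ii).
    entropy = 0.5 * D * np.log(2 * np.pi * np.e) + np.sum(np.log(np.diag(C)))

    # E_q[log N(w; 0, prior_var * I)], using E_q[w^T w] = m^T m + trace(C C^T).
    second_moment = m @ m + np.sum(C ** 2)
    prior_term = (-0.5 * D * np.log(2 * np.pi * prior_var)
                  - 0.5 * second_moment / prior_var)

    # Each site depends on w only through a_n = y_n * x_n^T w ~ N(mu_n, s_n^2).
    mu = y * (X @ m)
    s = np.sqrt(np.sum((X @ C) ** 2, axis=1))

    # Gauss-Hermite rule for weight exp(-z^2 / 2); its weights sum to sqrt(2*pi).
    z, wts = hermegauss(quad_points)
    a = mu[:, None] + s[:, None] * z[None, :]
    log_sigmoid = -np.logaddexp(0.0, -a)  # numerically stable log sigmoid(a)
    site_term = np.sum(log_sigmoid @ wts) / np.sqrt(2 * np.pi)

    return entropy + prior_term + site_term


# Toy usage with made-up data: two features, three labelled points.
X_toy = np.array([[1.0, 0.5], [-0.5, 1.5], [2.0, -1.0]])
y_toy = np.array([1.0, -1.0, 1.0])
print(gkl_bound(np.zeros(2), np.eye(2), X_toy, y_toy))
```

Because each site depends on the weights only through a one-dimensional projection, the D-dimensional Gaussian expectation reduces to one univariate integral per data point, which is why a quadrature rule suffices in this sketch. Maximising the returned value over `m` and `C` would tighten the bound.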

David Barber | Edward Challis
