Kullback-Leibler Proximal Variational Inference

We propose a new variational inference method based on a proximal framework that uses the Kullback-Leibler (KL) divergence as the proximal term. We make two contributions towards exploiting the geometry and structure of the variational bound. First, we propose a KL proximal-point algorithm and show its equivalence to variational inference with natural gradients (e.g., stochastic variational inference). Second, we use the proximal framework to derive efficient variational algorithms for non-conjugate models. We propose a splitting procedure that separates non-conjugate terms from conjugate ones, and we linearize the non-conjugate terms to obtain subproblems that admit a closed-form solution. Overall, our approach converts inference in a non-conjugate model into subproblems that involve inference in well-known conjugate models. We show that our method is applicable to a wide variety of models and can result in computationally efficient algorithms. Applications to real-world datasets show performance comparable to that of existing methods.
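
The first contribution can be illustrated on a toy conjugate model. The sketch below is not the paper's algorithm or code; it assumes a scalar model z ~ N(0, 1), y_i ~ N(z, 1) with a Gaussian variational family written in natural parameters lam = (mean/var, -1/(2 var)). Under these assumptions, the KL proximal-point step, which maximizes ELBO(lam) - KL(q_lam || q_lam_k) / beta, has a closed form that coincides with a natural-gradient step of size rho = beta / (1 + beta). All function names and the data values are hypothetical, and the step-size relation is derived for this toy setting only.

```python
import math

def posterior_natural_params(ys, prior_prec=1.0, noise_prec=1.0):
    # Exact posterior of z for z ~ N(0, 1/prior_prec), y_i ~ N(z, 1/noise_prec),
    # expressed as Gaussian natural parameters (prec * mean, -prec / 2).
    prec = prior_prec + noise_prec * len(ys)
    return (noise_prec * sum(ys), -0.5 * prec)

def mean_params(lam):
    # Expected sufficient statistics (E[z], E[z^2]) of the Gaussian q_lam.
    var = -1.0 / (2.0 * lam[1])
    mean = lam[0] * var
    return (mean, mean * mean + var)

def log_partition(lam):
    # Gaussian log-partition function, up to an additive constant.
    return -lam[0] ** 2 / (4.0 * lam[1]) - 0.5 * math.log(-2.0 * lam[1])

def kl(lam_q, lam_p):
    # KL(q || p) between two members of the same exponential family.
    mu = mean_params(lam_q)
    inner = sum((a - b) * m for a, b, m in zip(lam_q, lam_p, mu))
    return inner - log_partition(lam_q) + log_partition(lam_p)

def elbo(lam, eta_star):
    # In a conjugate model the ELBO equals -KL(q || posterior) plus a constant.
    return -kl(lam, eta_star)

def proximal_objective(lam, lam_k, eta_star, beta):
    # Objective maximized by one KL proximal-point step from lam_k.
    return elbo(lam, eta_star) - kl(lam, lam_k) / beta

def kl_proximal_step(lam_k, eta_star, beta):
    # Closed-form maximizer of proximal_objective (toy conjugate case only).
    return tuple((l + beta * e) / (1.0 + beta) for l, e in zip(lam_k, eta_star))

def natural_gradient_step(lam_k, eta_star, rho):
    # The natural gradient of the ELBO w.r.t. lam is (eta_star - lam) here.
    return tuple(l + rho * (e - l) for l, e in zip(lam_k, eta_star))

if __name__ == "__main__":
    ys = [0.5, 1.5, 1.0]          # made-up observations
    eta_star = posterior_natural_params(ys)
    lam = (0.0, -0.5)             # start from q = N(0, 1)
    beta = 0.5
    for k in range(5):
        prox = kl_proximal_step(lam, eta_star, beta)
        nat = natural_gradient_step(lam, eta_star, beta / (1.0 + beta))
        print(k, prox, nat, proximal_objective(prox, lam, eta_star, beta))
        lam = prox
```

Working in natural parameters keeps both updates linear, which is what makes the equivalence easy to see here; the non-conjugate case treated in the paper needs the splitting and linearization steps, which this sketch does not attempt.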
