Gaussian variational approximation with sparse precision matrices

We consider the problem of learning a Gaussian variational approximation to the posterior distribution of a high-dimensional parameter, where we impose sparsity in the precision matrix to reflect the conditional independence structure of the model. Sparsity in the precision matrix makes the Gaussian variational distribution both flexible and parsimonious, and it is achieved by parameterizing the distribution in terms of the Cholesky factor of the precision matrix. We develop efficient stochastic gradient methods for the optimization that make use of gradient information for the target distribution. We also consider alternative estimators of the stochastic gradients that have lower variance and are more stable. Our approach is illustrated using generalized linear mixed models and state-space models for time series.
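
To make the parameterization concrete, the following is a minimal sketch, not the authors' implementation. It assumes the variational distribution is N(mu, (T T^T)^{-1}) with T a lower-triangular Cholesky factor of the precision matrix, that sparsity is imposed by a fixed 0/1 mask on T, and that a draw is generated as theta = mu + T^{-T} s with s standard normal. The names `elbo_gradients`, `mask`, `log_h`, and `grad_log_h` are hypothetical, the quadratic target is a toy stand-in for a log joint density, and only the basic single-draw reparameterization estimator is shown, not the lower-variance alternatives mentioned above.

```python
import numpy as np

def elbo_gradients(mu, T, mask, grad_log_h, s):
    """Single-draw gradient estimate of the variational lower bound.

    Assumes q(theta) = N(mu, (T T^T)^{-1}) with T lower triangular and a
    draw theta = mu + T^{-T} s, where s ~ N(0, I). The entropy of q
    contributes -sum(log T_ii), which is differentiated analytically.
    """
    b = np.linalg.solve(T.T, s)          # T^{-T} s
    theta = mu + b
    g = grad_log_h(theta)                # gradient of the log target at theta
    a = np.linalg.solve(T, g)            # T^{-1} g
    grad_mu = g
    grad_T = -np.outer(b, a) - np.diag(1.0 / np.diag(T))
    return grad_mu, grad_T * mask        # zero out entries fixed at zero

# Toy target (an assumption for illustration): a Gaussian with tridiagonal
# precision Q, mimicking the Markov structure of a state-space model.
d = 5
Q = 2.0 * np.eye(d) - 0.5 * (np.eye(d, k=1) + np.eye(d, k=-1))

def log_h(theta):
    return -0.5 * theta @ Q @ theta

def grad_log_h(theta):
    return -Q @ theta

# Sparsity pattern for T: main diagonal and first subdiagonal only.
mask = np.eye(d) + np.eye(d, k=-1)

rng = np.random.default_rng(0)
mu = np.zeros(d)
T = np.eye(d)                            # initial value respects the mask
s = rng.standard_normal(d)

grad_mu, grad_T = elbo_gradients(mu, T, mask, grad_log_h, s)
print(grad_T.round(3))                   # nonzero only inside the band
```

In a full optimizer these estimates would drive stochastic gradient ascent on mu and the free entries of T. The step-size rule and a transformation that keeps the diagonal of T positive are left out of this sketch.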
