Covariances, Robustness, and Variational Bayes

Variational Bayes (VB) is an approximate Bayesian posterior inference technique that is increasingly popular due to its fast runtimes on large-scale datasets. However, even when VB provides accurate posterior means for certain parameters, it often mis-estimates variances and covariances. Furthermore, prior robustness measures have remained undeveloped for VB. By deriving a simple formula for the effect of infinitesimal model perturbations on VB posterior means, we provide both improved covariance estimates and local robustness measures for VB, thus greatly expanding the practical usefulness of VB posterior approximations. The estimates for VB posterior covariances rely on a result from the classical Bayesian robustness literature relating derivatives of posterior expectations to posterior covariances. Our key assumption is that the VB approximation provides good estimates of a select subset of posterior means -- an assumption that has been shown to hold in many practical settings. In our experiments, we demonstrate that our methods are simple, general, and fast, providing accurate posterior uncertainty estimates and robustness measures with runtimes that can be an order of magnitude smaller than MCMC.
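
The classical robustness result mentioned above can be illustrated on a toy problem. If a posterior p is tilted to p_t(theta) proportional to p(theta) exp(t . theta), the derivative of the posterior mean of theta with respect to t at t = 0 equals the posterior covariance of theta. The sketch below applies that idea to a mean-field VB approximation of a two-dimensional Gaussian target. It is not the paper's implementation: the target, the names `mfvb_means`, `mf_var`, and `lr_cov`, and the use of finite differences in place of an analytic derivative are all assumptions made for illustration.

```python
# Toy check: sensitivity-based covariances for mean-field VB on a 2-D Gaussian.
# The target, function names, and step size are illustrative assumptions.

def inv2(a):
    """Invert a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def mfvb_means(mu, lam, t, n_iter=500):
    """Coordinate-ascent mean-field VB means for the tilted target
    p_t(theta) proportional to N(theta | mu, inv(lam)) * exp(t . theta)."""
    m = list(mu)
    for _ in range(n_iter):
        # Each factor's mean given the current mean of the other factor.
        m[0] = mu[0] + (t[0] - lam[0][1] * (m[1] - mu[1])) / lam[0][0]
        m[1] = mu[1] + (t[1] - lam[1][0] * (m[0] - mu[0])) / lam[1][1]
    return m


mu = [0.5, -1.0]
sigma = [[1.0, 0.8], [0.8, 1.0]]  # exact posterior covariance of the toy target
lam = inv2(sigma)                 # exact posterior precision

# Mean-field VB variances for a Gaussian target: reciprocal precision diagonal.
mf_var = [1.0 / lam[0][0], 1.0 / lam[1][1]]

# Sensitivity-based covariance: central finite differences of the VB means
# with respect to the tilting parameter t, evaluated around t = 0.
eps = 1e-4
lr_cov = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):
    t_plus, t_minus = [0.0, 0.0], [0.0, 0.0]
    t_plus[j], t_minus[j] = eps, -eps
    m_plus = mfvb_means(mu, lam, t_plus)
    m_minus = mfvb_means(mu, lam, t_minus)
    for i in range(2):
        lr_cov[i][j] = (m_plus[i] - m_minus[i]) / (2.0 * eps)

print("exact covariance:       ", sigma)
print("mean-field variances:   ", mf_var)
print("sensitivity covariance: ", lr_cov)
```

In this toy case the VB means are exact, so the key assumption holds by construction: the mean-field variances come out at 0.36 against a true value of 1.0, while the sensitivity-based estimate recovers the full covariance matrix, including the off-diagonal term that the mean-field factorization discards. For non-Gaussian targets the agreement would only be as good as the VB means themselves.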
