How good is your Laplace approximation of the Bayesian posterior? Finite-sample computable error bounds for a variety of useful divergences

The Laplace approximation is a popular method for providing posterior mean and variance estimates. But can we trust these estimates for practical use? One might consider using rate-of-convergence bounds for the Bayesian Central Limit Theorem (BCLT) to provide quality guarantees for the Laplace approximation. But the bounds in existing versions of the BCLT either require knowing the true data-generating parameter, are asymptotic in the number of samples, do not control the Bayesian posterior mean, or apply only to narrow classes of models. Our work provides the first closed-form, finite-sample quality bounds for the Laplace approximation that simultaneously (1) do not require knowing the true parameter, (2) control posterior means and variances, and (3) apply generally to models that satisfy the conditions of the asymptotic BCLT. In fact, our bounds hold even under misspecification. We compute exact constants in our bounds for a variety of standard models, including logistic regression, and numerically demonstrate their utility. We also provide a framework for the analysis of more complex models.
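To fix ideas, the Laplace approximation the abstract refers to fits a Gaussian centered at the posterior mode, with covariance given by the inverse negative Hessian of the log posterior at that mode. The following is a minimal, self-contained sketch for one-dimensional Bayesian logistic regression with a Gaussian prior (the function name, Newton-iteration setup, and simulated data are illustrative, not taken from the paper):

```python
import numpy as np

def laplace_approx_logistic(x, y, prior_var=1.0, iters=50):
    """Laplace approximation for the posterior of a scalar logistic
    regression coefficient theta under a N(0, prior_var) prior.

    Returns (theta_map, var): the approximating Gaussian is
    N(theta_map, var), where var is the inverse negative Hessian of
    the log posterior at the mode.
    """
    theta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-theta * x))            # predicted probabilities
        grad = np.sum((y - p) * x) - theta / prior_var  # d/dtheta log posterior
        hess = -np.sum(p * (1 - p) * x**2) - 1.0 / prior_var  # second derivative
        theta -= grad / hess                            # Newton step toward the mode
    return theta, -1.0 / hess

# Simulated data: the quality question the paper addresses is how far
# the Gaussian N(mean, var) returned here is from the true posterior.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
true_theta = 1.5
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-true_theta * x))).astype(float)
mean, var = laplace_approx_logistic(x, y)
```

The paper's contribution is not this construction itself but computable, finite-sample bounds on the error of the resulting mean and variance estimates.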
