A unified performance analysis of likelihood-informed subspace methods

The likelihood-informed subspace (LIS) method offers a viable route to reducing the dimensionality of high-dimensional probability distributions arising in Bayesian inference. LIS identifies an intrinsic low-dimensional linear subspace where the target distribution differs the most from some tractable reference distribution. Such a subspace can be identified using the leading eigenvectors of a Gram matrix of the gradient of the log-likelihood function. Then, the original high-dimensional target distribution is approximated through various forms of marginalization of the likelihood function, in which the approximate likelihood has support only on the intrinsic low-dimensional subspace. This approximation enables the design of inference algorithms that can scale sub-linearly with the apparent dimensionality of the problem. Intuitively, the accuracy of the approximation, and hence the performance of the inference algorithms, is influenced by three factors: the dimension truncation error in identifying the subspace, the Monte Carlo error in estimating the Gram matrices, and the Monte Carlo error in constructing the marginalizations. This work establishes a unified framework for analyzing each of these three factors and their interplay. Under mild technical assumptions, we establish error bounds for a range of existing dimension reduction techniques based on the principle of LIS. Our error bounds also provide useful insights into the accuracy of these methods. In addition, we analyze the integration of LIS with sampling methods such as Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC). We also demonstrate the applicability of our analysis on a linear inverse problem with a Gaussian prior, showing that all the estimates can be dimension-independent if the prior covariance is a trace-class operator. Finally, we demonstrate various aspects of our theoretical claims on two nonlinear inverse problems.
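In its simplest form, the construction sketched above reduces to two numerical steps: a Monte Carlo estimate of the Gram matrix H = E[g(x) g(x)^T], where g(x) is the gradient of the log-likelihood and the expectation is taken under the reference distribution, followed by a truncated eigendecomposition whose leading eigenvectors span the LIS. The following minimal NumPy sketch illustrates these two steps; the function name estimate_lis_basis and its callable arguments grad_log_lik and sample_prior are hypothetical conveniences for this illustration, not an interface from the paper.

```python
import numpy as np

def estimate_lis_basis(grad_log_lik, sample_prior, n_samples, rank):
    """Monte Carlo sketch of a likelihood-informed subspace (LIS) basis.

    grad_log_lik : callable, gradient of the log-likelihood at x, shape (d,).
    sample_prior : callable, one draw from the reference distribution, shape (d,).
    Returns the `rank` leading eigenvectors (as columns) and eigenvalues of
    the estimated Gram matrix H = E[g(x) g(x)^T].
    """
    d = sample_prior().shape[0]
    H = np.zeros((d, d))
    for _ in range(n_samples):
        g = grad_log_lik(sample_prior())
        H += np.outer(g, g) / n_samples   # Monte Carlo average of g g^T
    eigvals, eigvecs = np.linalg.eigh(H)  # symmetric H: eigenvalues ascending
    idx = np.argsort(eigvals)[::-1][:rank]
    return eigvecs[:, idx], eigvals[idx]

# Toy usage on a linear-Gaussian problem: the gradient of the Gaussian
# log-likelihood lies in range(A^T), so H has rank at most r and its
# leading eigenvectors recover that r-dimensional subspace.
rng = np.random.default_rng(0)
d, r = 50, 3
A = rng.standard_normal((r, d))          # low-rank forward operator
y = rng.standard_normal(r)               # synthetic data
grad = lambda x: A.T @ (y - A @ x)       # grad log-likelihood, unit noise
basis, spectrum = estimate_lis_basis(grad, lambda: rng.standard_normal(d), 500, r)
```

Variants in the literature typically whiten by the prior covariance before the eigendecomposition; the sampling error of the averaging loop above corresponds to the second of the three error factors analyzed in this work.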
