Approximation and sampling of multivariate probability distributions in the tensor train decomposition

General multivariate distributions are notoriously expensive to sample from, particularly the high-dimensional posterior distributions in PDE-constrained inverse problems. This paper develops a sampler for arbitrary continuous multivariate distributions that is based on low-rank surrogates in the tensor train format, a methodology that has been exploited for many years for scalable, high-dimensional density function approximation in quantum physics and chemistry. We build on recent developments in cross-approximation algorithms from linear algebra to construct a tensor train approximation to the target probability density function using a small number of function evaluations. For sufficiently smooth distributions, the storage required for accurate tensor train approximations is moderate, scaling linearly with dimension. In turn, the structure of the tensor train surrogate allows sampling by an efficient conditional distribution method, since marginal distributions are computable with linear complexity in dimension. Expected values of non-smooth quantities of interest, with respect to the surrogate distribution, can be estimated using transformed independent uniformly distributed random seeds, which provide Monte Carlo quadrature, or transformed points from a quasi-Monte Carlo lattice, which give more efficient quasi-Monte Carlo quadrature. Unbiased estimates may be calculated by correcting the transformed random seeds using a Metropolis–Hastings accept/reject step, while the quasi-Monte Carlo quadrature may be corrected either by a control-variate strategy or by importance weighting. We show that the error in the tensor train approximation propagates linearly into the Metropolis–Hastings rejection rate and the integrated autocorrelation time of the resulting Markov chain; thus the integrated autocorrelation time may be made arbitrarily close to 1, implying that, asymptotically in sample size, the cost per effectively independent sample is one target-density evaluation plus the cheap tensor train surrogate proposal, which has linear cost in dimension. These methods are demonstrated in three computed examples: fitting failure-time data for shock absorbers; a PDE-constrained inverse diffusion problem; and sampling from the Rosenbrock distribution. The delayed rejection adaptive Metropolis (DRAM) algorithm is used as a benchmark. In all computed examples, the importance-weight-corrected quasi-Monte Carlo quadrature performs best and is more efficient than DRAM by orders of magnitude across a wide range of approximation accuracies and sample sizes. Indeed, all the methods developed here significantly outperform DRAM in all computed examples.
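
To make the conditional distribution method concrete, the following minimal sketch (Python with NumPy; not the authors' code) draws samples from a density that has already been discretized and approximated in tensor train format. The names (sample_tt, cores, grids) and the discrete inverse-CDF step at the grid nodes are illustrative assumptions simplifying the scheme summarized above; each uniform seed may come from a pseudo-random generator (Monte Carlo) or from a QMC lattice (quasi-Monte Carlo).

    import numpy as np

    def sample_tt(cores, grids, u):
        """Draw one sample from a TT density by the conditional method.

        cores : list of d arrays, cores[k] of shape (r_{k-1}, n_k, r_k)
        grids : list of d 1D arrays of grid nodes
        u     : d seeds in [0,1) (pseudo-random for MC, lattice points for QMC)
        """
        d = len(cores)
        # suffix[k] = sum over x_k..x_d of G_k(x_k) ... G_d(x_d); computing all
        # of these costs one backward pass, i.e. linear work in dimension d.
        suffix = [None] * (d + 1)
        suffix[d] = np.ones(1)
        for k in range(d - 1, -1, -1):
            suffix[k] = cores[k].sum(axis=1) @ suffix[k + 1]      # shape (r_{k-1},)
        x = np.zeros(d)
        prefix = np.ones(1)   # accumulated G_1(x_1) ... G_{k-1}(x_{k-1})
        for k in range(d):
            # Unnormalized 1D conditional p(x_k | x_1..x_{k-1}) on the grid
            pk = np.einsum('i,ijl,l->j', prefix, cores[k], suffix[k + 1])
            pk = np.abs(pk)                      # guard against small negative values
            cdf = np.cumsum(pk)
            cdf /= cdf[-1]
            j = np.searchsorted(cdf, u[k])       # invert the discrete 1D CDF
            x[k] = grids[k][j]
            prefix = prefix @ cores[k][:, j, :]  # condition on the chosen node
        return x

    # Toy usage: a rank-1 TT for a product of standard 1D Gaussians, d = 3.
    d, n = 3, 64
    grids = [np.linspace(-4, 4, n)] * d
    cores = [np.exp(-0.5 * g**2).reshape(1, n, 1) for g in grids]
    rng = np.random.default_rng(0)
    samples = np.array([sample_tt(cores, grids, rng.random(d)) for _ in range(1000)])
    print(samples.mean(axis=0), samples.std(axis=0))   # near 0 and 1

Each sample costs one sweep over the d cores, which is the linear-in-dimension proposal cost quoted above; feeding such proposals into a Metropolis–Hastings accept/reject step on the true target, or reweighting them, yields the corrected estimators described in the abstract.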
