Deep Composition of Tensor Trains using Squared Inverse Rosenblatt Transports

Characterising intractable high-dimensional random variables is one of the fundamental challenges in stochastic computation. The recent surge of transport maps offers a mathematical foundation and new insights for tackling this challenge by coupling intractable random variables with tractable reference random variables. This paper generalises a recently developed functional tensor-train (FTT) approximation of the inverse Rosenblatt transport [14] to a wide class of high-dimensional nonnegative functions, such as unnormalised probability density functions. First, we extend the inverse Rosenblatt transform to enable the transport to general reference measures other than the uniform measure. We develop an efficient procedure to compute this transport from a squared FTT decomposition which preserves the monotonicity. More crucially, we integrate the proposed monotonicity-preserving FTT transport into a nested variable transformation framework inspired by deep neural networks. The resulting deep inverse Rosenblatt transport significantly expands the capability of tensor approximations and transport maps to random variables with complicated nonlinear interactions and concentrated density functions. We demonstrate the efficacy of the proposed approach on a range of applications in statistical learning and uncertainty quantification, including parameter estimation for dynamical systems and inverse problems constrained by partial differential equations.
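As a rough illustration of the core idea, the sketch below builds an inverse Rosenblatt transport for a two-dimensional unnormalised density from an approximation g of its square root, so that the density g^2 is nonnegative by construction and the resulting marginal and conditional distribution functions are monotone. It is a grid-based toy under stated assumptions, not the FTT algorithm of the paper: the function name `inverse_rosenblatt_2d`, the tensor grid, the small positive floor `tau`, and the correlated Gaussian test density are all hypothetical choices made for this example.

```python
import numpy as np

def inverse_rosenblatt_2d(sqrt_density, grid, u, tau=1e-10):
    """Map uniform samples u in [0,1]^2 to samples of the density g^2 tabulated on a uniform grid."""
    g = sqrt_density(grid[:, None], grid[None, :])  # g[i, j] = g(x1_i, x2_j)
    p = g ** 2                                      # squaring guarantees nonnegativity
    p = p + tau * p.max()                           # assumed small floor: keeps the CDFs strictly increasing
    h = grid[1] - grid[0]

    # Marginal of x1 (up to a constant), then its CDF by the trapezoid rule.
    marg1 = p.sum(axis=1)
    cdf1 = np.concatenate([[0.0], np.cumsum(0.5 * (marg1[1:] + marg1[:-1]))])
    cdf1 /= cdf1[-1]
    x1 = np.interp(u[:, 0], cdf1, grid)             # invert the marginal CDF

    # Conditional density of x2 given each sampled x1 (linear interpolation between grid rows).
    i = np.clip(np.searchsorted(grid, x1) - 1, 0, len(grid) - 2)
    w = (x1 - grid[i]) / h
    cond = (1.0 - w)[:, None] * p[i] + w[:, None] * p[i + 1]

    # Conditional CDFs, one row per sample, then inversion row by row.
    incr = 0.5 * (cond[:, 1:] + cond[:, :-1])
    cdf2 = np.concatenate([np.zeros((len(u), 1)), np.cumsum(incr, axis=1)], axis=1)
    cdf2 /= cdf2[:, -1:]
    x2 = np.array([np.interp(u[k, 1], cdf2[k], grid) for k in range(len(u))])
    return np.column_stack([x1, x2])

# Hypothetical target: zero-mean Gaussian with unit variances and correlation rho.
rho = 0.8
def sqrt_density(x1, x2):
    q = (x1 ** 2 - 2.0 * rho * x1 * x2 + x2 ** 2) / (1.0 - rho ** 2)
    return np.exp(-0.25 * q)                        # square root of exp(-q / 2)

rng = np.random.default_rng(0)
u = rng.random((5000, 2))                           # reference samples, uniform on [0,1]^2
grid = np.linspace(-6.0, 6.0, 241)
samples = inverse_rosenblatt_2d(sqrt_density, grid, u)
```

In this sketch the reference measure is uniform; transporting from another product reference, as the abstract describes, would amount to first pushing the reference samples through the reference's own marginal distribution functions.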

[1]  Tamara G. Kolda, et al. Tensor Decompositions and Applications, 2009, SIAM Rev.

[2]  Danny C. Sorensen, et al. Nonlinear Model Reduction via Discrete Empirical Interpolation, 2010, SIAM J. Sci. Comput.

[3]  Ivan V. Oseledets, et al. DMRG Approach to Fast Linear Algebra in the TT-Format, 2011, Comput. Methods Appl. Math.

[4]  Radford M. Neal. Sampling from multimodal distributions using tempered transitions, 1996, Stat. Comput.

[5]  H. Bungartz, et al. Sparse grids, 2004, Acta Numerica.

[6]  Y. Marzouk, et al. Greedy inference with layers of lazy maps, 2019, 1906.00031.

[7]  Hoon Kim, et al. Monte Carlo Statistical Methods, 2000, Technometrics.

[8]  Wang, et al. Replica Monte Carlo simulation of spin glasses, 1986, Physical Review Letters.

[9]  Eric Nalisnick, et al. Normalizing Flows for Probabilistic Modeling and Inference, 2019, J. Mach. Learn. Res.

[10]  Alexandros Beskos, et al. Sequential Monte Carlo Methods for High-Dimensional Inverse Problems: A Case Study for the Navier-Stokes Equations, 2013, SIAM/ASA J. Uncertain. Quantification.

[11]  S. V. Dolgov, et al. Alternating Minimal Energy Methods for Linear Systems in Higher Dimensions, 2014.

[12]  Tiangang Cui, et al. Data-driven model reduction for the Bayesian solution of inverse problems, 2014, 1403.4290.

[13]  Tim Hesterberg, et al. Monte Carlo Strategies in Scientific Computing, 2002, Technometrics.

[14]  Petros Drineas, et al. CUR matrix decompositions for improved data analysis, 2009, Proceedings of the National Academy of Sciences.

[15]  Ivan Kobyzev, et al. Normalizing Flows: An Introduction and Review of Current Methods, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Sertac Karaman, et al. A continuous analogue of the tensor-train decomposition, 2015, Computer Methods in Applied Mechanics and Engineering.

[17]  W. Hackbusch. Tensor Spaces and Numerical Tensor Calculus, 2012, Springer Series in Computational Mathematics.

[18]  Jie Shen, et al. Spectral Methods: Algorithms, Analysis and Applications, 2011.

[19]  Tiangang Cui, et al. Scalable posterior approximations for large-scale Bayesian inverse problems via likelihood-informed parameter and state reduction, 2015, J. Comput. Phys.

[20]  R. Tweedie, et al. Rates of convergence of the Hastings and Metropolis algorithms, 1996.

[21]  White, et al. Density-matrix algorithms for quantum renormalization groups, 1993, Physical Review B, Condensed Matter.

[22]  C. Villani. Optimal Transport: Old and New, 2008.

[23]  Ivan Oseledets, et al. Tensor-Train Decomposition, 2011, SIAM J. Sci. Comput.

[24]  Guillaume Carlier, et al. From Knothe's Transport to Brenier's Map and a Continuation Method for Optimal Transport, 2008, SIAM J. Math. Anal.

[25]  K. Hukushima, et al. Exchange Monte Carlo Method and Application to Spin Glass Simulations, 1995, cond-mat/9512035.

[26]  Dilin Wang, et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, 2016, NIPS.

[27]  Xiao-Li Meng, et al. Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling, 1998.

[28]  M. Rosenblatt. Remarks on a Multivariate Transformation, 1952.

[29]  S. Goreinov, et al. How to find a good submatrix, 2010.

[30]  Michael I. Jordan, et al. An Introduction to Variational Methods for Graphical Models, 1999, Machine Learning.

[31]  Robert Scheichl, et al. Rank Bounds for Approximating Gaussian Densities in the Tensor-Train Format, 2020, SIAM/ASA J. Uncertain. Quantification.

[32]  Reinhold Schneider, et al. The Alternating Linear Scheme for Tensor Optimization in the Tensor Train Format, 2012, SIAM J. Sci. Comput.

[33]  Youssef M. Marzouk, et al. Spectral Tensor-Train Decomposition, 2014, SIAM J. Sci. Comput.

[34]  Tiangang Cui, et al. Optimal Low-rank Approximations of Bayesian Linear Inverse Problems, 2014, SIAM J. Sci. Comput.

[35]  S. Goreinov, et al. A Theory of Pseudoskeleton Approximations, 1997.

[36]  David M. Blei, et al. Variational Inference: A Review for Statisticians, 2016, ArXiv.

[37]  E. Tyrtyshnikov, et al. TT-cross approximation for multidimensional arrays, 2010.

[38]  N. Nguyen, et al. An 'empirical interpolation' method: application to efficient reduced-basis discretization of partial differential equations, 2004.

[39]  Ullrich Köthe, et al. HINT: Hierarchical Invertible Neural Transport for Density Estimation and Bayesian Inference, 2019.

[40]  Severnyi Kavkaz. Pseudo-Skeleton Approximations by Matrices of Maximal Volume, 2022.

[41]  Tiangang Cui, et al. Certified dimension reduction in nonlinear Bayesian inverse problems, 2018, Math. Comput.

[42]  W. Förstner, et al. A Metric for Covariance Matrices, 2003.

[43]  M. Girolami, et al. Riemann manifold Langevin and Hamiltonian Monte Carlo methods, 2011, Journal of the Royal Statistical Society: Series B (Statistical Methodology).

[44]  Youssef Marzouk, et al. Transport Map Accelerated Markov Chain Monte Carlo, 2014, SIAM/ASA J. Uncertain. Quantification.

[45]  Ivan V. Oseledets, et al. Rectangular maximum-volume submatrices and their applications, 2015, ArXiv.

[46]  Heikki Haario, et al. DRAM: Efficient adaptive MCMC, 2006, Stat. Comput.

[47]  Colin Fox, et al. Approximation and sampling of multivariate probability distributions in the tensor train decomposition, 2018, Statistics and Computing.

[48]  Xiao-Li Meng, et al. Simulating Ratios of Normalizing Constants via a Simple Identity: A Theoretical Exploration, 1996.

[49]  Tiangang Cui, et al. A Stein variational Newton method, 2018, NeurIPS.

[50]  Benjamin Peherstorfer, et al. A transport-based multifidelity preconditioner for Markov chain Monte Carlo, 2018, Advances in Computational Mathematics.

[51]  Frances Y. Kuo, et al. High-dimensional integration: The quasi-Monte Carlo way, 2013, Acta Numerica.

[52]  Ivan Kobyzev, et al. Normalizing Flows: Introduction and Ideas, 2019, ArXiv.

[53]  Youssef Marzouk, et al. Greedy inference with structure-exploiting lazy maps, 2020, NeurIPS.

[54]  Shakir Mohamed, et al. Variational Inference with Normalizing Flows, 2015, ICML.

[55]  Tiangang Cui, et al. Likelihood-informed dimension reduction for nonlinear inverse problems, 2014, 1403.4680.

[56]  A. M. Stuart, et al. Quasi-Monte Carlo and Multilevel Monte Carlo Methods for Computing Posterior Expectations in Elliptic Inverse Problems, 2016, SIAM/ASA J. Uncertain. Quantification.

[57]  Youssef M. Marzouk, et al. Inference via Low-Dimensional Couplings, 2017, J. Mach. Learn. Res.

[58]  Youssef M. Marzouk, et al. Bayesian inference with optimal maps, 2011, J. Comput. Phys.

[59]  D. Higdon. Space and Space-Time Modeling using Process Convolutions, 2002.

[60]  H. Knothe. Contributions to the theory of convex bodies, 1957.