Multilevel approximation of Gaussian random fields: Covariance compression, estimation and spatial prediction

Centered Gaussian random fields (GRFs) indexed by compacta, such as smooth, bounded domains in Euclidean space or smooth, compact and orientable manifolds, are determined by their covariance operators. We consider centered GRFs given sample-wise as variational solutions to coloring operator equations driven by spatial white noise, where the pseudodifferential coloring operator is elliptic, self-adjoint and positive, and belongs to the Hörmander class. This includes the Matérn class of GRFs as a special case. Using microlocal tools and biorthogonal multiresolution analyses on the manifold, we prove that the precision and covariance operators may each be identified with a bi-infinite matrix, and that finite sections of these matrices may be diagonally preconditioned, rendering the condition number independent of the dimension p of the section. We prove that a tapering strategy by thresholding, as e.g. in [Bickel, P.J. and Levina, E. Covariance regularization by thresholding, Ann. Statist., 36 (2008), 2577–2604], applied to finite sections of the bi-infinite precision and covariance matrices results in optimally numerically sparse approximations. Numerical sparsity signifies that, asymptotically, only linearly many nonzero matrix entries suffice to approximate the original section of the bi-infinite covariance or precision matrix to arbitrary precision with this tapering strategy. The tapering strategy is non-adaptive, and the locations of the nonzero matrix entries are known a priori. The tapered covariance or precision matrices may also be optimally diagonally preconditioned. Analysis of the relative size of the entries of the tapered covariance matrices motivates novel multilevel Monte Carlo (MLMC) oracles for covariance estimation, with sample complexity that scales log-linearly with respect to the number p of parameters. This extends [Bickel, P.J. and Levina, E. Regularized Estimation of Large Covariance Matrices, Ann. Statist., 36 (2008), pp. 199–227] to estimation of (finite sections of) pseudodifferential covariances for GRFs by this fast MLMC method. Assuming that sections of the bi-infinite covariance matrix in wavelet coordinates are at hand, we propose and analyze a novel compressive algorithm for simulating and kriging of GRFs. The complexity (work and memory vs. accuracy) of these three algorithms scales near-optimally in terms of the number of parameters p of the sample-wise approximation of the GRF in Sobolev scales.
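
To make the idea of thresholding a covariance estimate concrete, the following is a minimal sketch of generic entrywise hard thresholding applied to a sample covariance matrix, in the spirit of the Bickel and Levina reference cited above. It is not the a-priori wavelet tapering pattern or the MLMC estimator of this work: the function names, the toy covariance with geometrically decaying entries, and the values of `p`, `n` and `tau` are all assumptions chosen purely for illustration.

```python
import numpy as np


def hard_threshold(matrix, tau):
    """Zero every off-diagonal entry with |entry| <= tau; keep the diagonal."""
    out = np.where(np.abs(matrix) > tau, matrix, 0.0)
    np.fill_diagonal(out, np.diag(matrix))
    return out


def thresholded_sample_covariance(samples, tau):
    """samples: (n, p) array of centered draws; returns a (p, p) estimate."""
    n = samples.shape[0]
    # The field is assumed centered, so no empirical mean is subtracted.
    sample_cov = samples.T @ samples / n
    return hard_threshold(sample_cov, tau)


def demo(p=64, n=2000, tau=0.1, seed=0):
    """Toy experiment (hypothetical setup, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    idx = np.arange(p)
    # Illustrative covariance whose entries decay away from the diagonal.
    true_cov = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(true_cov)
    samples = rng.standard_normal((n, p)) @ chol.T
    estimate = thresholded_sample_covariance(samples, tau)
    return true_cov, estimate


if __name__ == "__main__":
    true_cov, estimate = demo()
    print("nonzero entries kept:", np.count_nonzero(estimate), "of", estimate.size)
    print("spectral-norm error:", np.linalg.norm(estimate - true_cov, 2))
```

In this sketch the retained entries are selected from the data, whereas the abstract describes a non-adaptive strategy in which the locations of the retained entries are known in advance.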

[1]  H. Harbrecht,et al.  Wavelet Galerkin Schemes for 2D-BEM , 2001 .

[2]  Andrew M. Stuart,et al.  Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms , 2018, Applied and Computational Harmonic Analysis.

[3]  Steffen Lauritzen,et al.  Gaussian Graphical Models , 2018, Handbook of Graphical Models.

[4]  Reinhold Schneider,et al.  Multiskalen- und Wavelet-Matrixkompression , 1998 .

[5]  Markov Random Fields , 1982, Encyclopedia of Social Network Analysis and Mining.

[6]  B. A. Schmitt Perturbation bounds for matrix square roots and pythagorean sums , 1992 .

[7]  Helmut Harbrecht,et al.  The H2-wavelet method , 2014, J. Comput. Appl. Math..

[8]  L. Herrmann,et al.  Multilevel quasi-Monte Carlo integration with product weights for elliptic PDEs with lognormal coefficients , 2019, ESAIM: Mathematical Modelling and Numerical Analysis.

[9]  R. Schneider,et al.  Multiskalen- und Wavelet-Matrixkompression: Analysisbasierte Methoden zur effizienten Lösung großer vollbesetzter Gleichungssysteme , 1995 .

[10]  Wolfgang Dahmen,et al.  Compression Techniques for Boundary Integral Equations - Asymptotically Optimal Complexity Estimates , 2006, SIAM J. Numer. Anal..

[11]  P. Bickel,et al.  Regularized estimation of large covariance matrices , 2008, 0803.1909.

[12]  Tom Fleischer,et al.  Applied Functional Analysis , 2016 .

[13]  Wolfgang Dahmen,et al.  Wavelet approximation methods for pseudodifferential equations II: Matrix compression and fast solution , 1993, Adv. Comput. Math..

[14]  J. R. Wallis,et al.  An Approach to Statistical Spatial-Temporal Modeling of Meteorological Fields , 1994 .

[15]  Sudipto Banerjee,et al.  Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets , 2014, Journal of the American Statistical Association.

[16]  J. Pasciak,et al.  Computer solution of large sparse positive definite systems , 1982 .

[17]  Rob Stevenson,et al.  Finite‐element wavelets on manifolds , 2003 .

[18]  Milton Abramowitz,et al.  Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables , 1964 .

[19]  Houman Owhadi,et al.  Conditioning Gaussian measure on Hilbert space , 2015, 1506.04208.

[20]  Christoph Schwab,et al.  Exponential convergence of hp quadrature for integral operators with Gevrey kernels , 2011 .

[21]  Martin Wainwright,et al.  Inference in High-Dimensional Graphical Models , 2018 .

[22]  N. Cressie,et al.  Fixed rank kriging for very large spatial data sets , 2008 .

[23]  D. Higdon Space and Space-Time Modeling using Process Convolutions , 2002 .

[24]  Jacob K. White,et al.  Multiscale Bases for the Sparse Representation of Boundary Integral Operators on Complex Geometry , 2002, SIAM J. Sci. Comput..

[25]  M. Abramowitz,et al.  Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables , 1966 .

[26]  P. Bickel,et al.  Covariance regularization by thresholding , 2009, 0901.3079.

[27]  Christoph Schwab,et al.  Multilevel approximation of Gaussian random fields: Fast simulation , 2019 .

[28]  Matthias Katzfuss,et al.  A Multi-Resolution Approximation for Massive Spatial Datasets , 2015, 1507.04789.

[29]  Wolfgang Dahmen,et al.  Wavelets on Manifolds I: Construction and Domain Decomposition , 1999, SIAM J. Math. Anal..

[30]  D. Nychka,et al.  Covariance Tapering for Interpolation of Large Spatial Datasets , 2006 .

[31]  L. Hörmander The Analysis of Linear Partial Differential Operators III , 2007 .

[32]  F. Lindgren,et al.  Spatial models generated by nested stochastic partial differential equations, with an application to global ozone mapping , 2011, 1104.3436.

[33]  F. Lutscher Spatial Variation , 2019, Interdisciplinary Applied Mathematics.

[34]  Y. Meyer Opérateurs de Calderón-Zygmund , 1990 .

[35]  I. Daubechies,et al.  Biorthogonal bases of compactly supported wavelets , 1992 .

[36]  Robert T. Seeley,et al.  Complex powers of an elliptic operator , 1967 .

[37]  Roger Woodard,et al.  Interpolation of Spatial Data: Some Theory for Kriging , 1999, Technometrics.

[38]  David Bolin,et al.  The Rational SPDE Approach for Gaussian Random Fields With General Smoothness , 2017, Journal of Computational and Graphical Statistics.

[39]  David Bolin,et al.  Numerical solution of fractional elliptic stochastic PDEs with spatial white noise , 2017, IMA Journal of Numerical Analysis.

[40]  C. Schwab,et al.  Numerical analysis of lognormal diffusions on the sphere , 2016, Stochastics and Partial Differential Equations: Analysis and Computations.

[41]  Rob Stevenson,et al.  Finite element wavelets with improved quantitative properties , 2009 .

[42]  L. Saulis,et al.  Limit theorems for large deviations , 1991 .

[43]  A. Gelfand,et al.  Gaussian predictive process models for large spatial data sets , 2008, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[44]  M. Czubak,et al.  Pseudodifferential Operators , 2020, Introduction to Partial Differential Equations.

[45]  L. Hörmander The analysis of linear partial differential operators , 1990 .

[46]  Thierry Aubin,et al.  Some Nonlinear Problems in Riemannian Geometry , 1998 .

[47]  Rob Stevenson,et al.  A quadratic finite element wavelet Riesz basis , 2018, Int. J. Wavelets Multiresolution Inf. Process..

[48]  Joseph J. Kohn,et al.  An algebra of pseudo‐differential operators , 1965 .

[49]  Helmut Harbrecht,et al.  Covariance regularity and $\mathcal{H}$-matrix approxi , 2014, Numerische Mathematik.

[50]  Reinhold Schneider,et al.  Biorthogonal wavelet bases for the boundary element method , 2004 .

[51]  Paul Krée,et al.  Pseudo-differential operators and Gevrey classes , 1967 .

[52]  S. Geer,et al.  Inference in High-Dimensional Graphical Models , 2018, Handbook of Graphical Models.

[53]  N. Higham,et al.  Computing A, log(A) and Related Matrix Functions by Contour Integrals , 2007 .

[54]  Helmut Harbrecht,et al.  A fast direct solver for nonlocal operators in wavelet coordinates , 2020, J. Comput. Phys..

[55]  Adam J. Rothman,et al.  Sparse permutation invariant covariance estimation , 2008, 0801.4837.

[56]  H. Rue,et al.  An explicit link between Gaussian fields and Gaussian Markov random fields; The SPDE approach , 2010 .

[57]  W. Hackbusch,et al.  Hierarchical Matrices: Algorithms and Analysis , 2015 .

[58]  Robert Scheichl,et al.  Finite Element Error Analysis of Elliptic PDEs with Random Coefficients and Its Application to Multilevel Monte Carlo Methods , 2013, SIAM J. Numer. Anal..

[59]  James A. Nichols,et al.  Quasi-Monte Carlo finite element methods for elliptic PDEs with lognormal random coefficients , 2015, Numerische Mathematik.

[60]  H. Rue,et al.  An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach , 2011 .

[61]  Kristin Kirchner,et al.  Regularity and convergence analysis in Sobolev and Hölder spaces for generalized Whittle–Matérn fields , 2019, Numerische Mathematik.

[62]  Adam J. Rothman,et al.  A new approach to Cholesky-based covariance regularization in high dimensions , 2009, 0903.0645.

[63]  D. Nychka,et al.  A Multiresolution Gaussian Process Model for the Analysis of Large Spatial Datasets , 2015 .

[64]  Dorit Hammerling,et al.  A Case Study Competition Among Methods for Analyzing Large Spatial Data , 2017, Journal of Agricultural, Biological and Environmental Statistics.