A class of multi-resolution approximations for large spatial datasets

Gaussian processes are popular and flexible models for spatial, temporal, and functional data, but exact inference is computationally infeasible for large datasets. We discuss Gaussian-process approximations that use basis functions at multiple resolutions to achieve fast inference and that can (approximately) represent any spatial covariance structure. We consider two special cases of this multi-resolution-approximation framework: a taper version and a domain-partitioning (block) version. We describe theoretical properties and inference procedures, and study the computational complexity of the methods. Numerical comparisons and an application to satellite data are also provided.
