Review on statistical methods for large spatial Gaussian data

The Gaussian geostatistical model is widely used for modeling spatial data. However, inference under this model is computationally demanding, because evaluating the log-likelihood requires inverting a large covariance matrix. Three strategies have been employed to address this challenge: likelihood approximation, lower-dimensional space approximation, and Markov random field approximation. In this paper, we review statistical approaches that address the computational challenge. As an illustration, we also apply the integrated nested Laplace approximation (INLA), one of the Markov approximation approaches, to real data to show its use in practice with large spatial data.
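To make the computational bottleneck concrete, the following is a minimal sketch of the exact Gaussian log-likelihood evaluation. It is not taken from the paper: the exponential covariance with a nugget, the zero mean, the parameter values (`sigma2`, `phi`, `tau2`), and the function names are all illustrative assumptions. The dense Cholesky factorization is the step whose cost grows roughly as the cube of the number of locations n, which is what the three approximation strategies aim to avoid.

```python
import numpy as np

def exp_cov(coords, sigma2=1.0, phi=0.3, tau2=0.1):
    """Exponential covariance plus nugget (assumed, illustrative model)."""
    # Pairwise Euclidean distances between all n locations: an n x n matrix.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi) + tau2 * np.eye(len(coords))

def gaussian_loglik(y, mean, cov):
    """Exact log-likelihood of y ~ N(mean, cov) using a Cholesky factor."""
    n = len(y)
    L = np.linalg.cholesky(cov)        # cov = L L^T; the O(n^3) step
    z = np.linalg.solve(L, y - mean)   # z = L^{-1} (y - mean)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + z @ z)

# Simulated data at 200 random locations in the unit square (hypothetical).
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))
cov = exp_cov(coords)
y = rng.multivariate_normal(np.zeros(200), cov)
ll = gaussian_loglik(y, np.zeros(200), cov)
```

With n = 200 this runs almost instantly, but the n x n covariance matrix and its factorization quickly become impractical as n reaches the tens of thousands, both in memory and in time.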
