Reduced Basis Kriging for Big Spatial Fields

In spatial statistics, a common approach to prediction over a Gaussian random field (GRF) is maximum likelihood estimation combined with kriging. For massive data sets, kriging is computationally intensive in both CPU time and memory, and fixed rank kriging has been proposed as a solution. That method, however, still involves operations on large matrices, so we develop a modification that combines the approximations made in fixed rank kriging with restricted maximum likelihood estimation and sparse matrix methodology. Experiments show that our methodology can provide additional gains in computational efficiency over fixed rank kriging without loss of prediction accuracy. The methodology is applied to climate data archived by the United States National Climatic Data Center, with very good results.
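
To make the computational point concrete, the following is a minimal sketch of the generic low-rank idea behind fixed rank kriging, not the reduced basis method developed here. It assumes a zero-mean field with covariance S K S^T + sigma2 I, where S is an n x r basis matrix with r much smaller than n, and uses the Sherman-Morrison-Woodbury identity so that only r x r systems are solved. The Gaussian basis functions, the function names (gaussian_basis, frk_predict), and all parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_basis(locs, centers, scale):
    """Evaluate r Gaussian bumps at n locations; returns an (n, r) matrix."""
    d2 = ((locs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / scale**2)

def frk_predict(S_obs, S_new, K, sigma2, z):
    """Zero-mean kriging predictor under Sigma = S K S^T + sigma2 * I.

    Woodbury gives Sigma^{-1} z = (z - S A^{-1} S^T z) / sigma2 with
    A = sigma2 * K^{-1} + S^T S, so only r x r systems are solved.
    """
    A = sigma2 * np.linalg.inv(K) + S_obs.T @ S_obs
    Sigma_inv_z = (z - S_obs @ np.linalg.solve(A, S_obs.T @ z)) / sigma2
    return S_new @ (K @ (S_obs.T @ Sigma_inv_z))

# Synthetic setup (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n, m, r, sigma2 = 300, 50, 6, 0.1
locs_obs = rng.uniform(0, 1, size=(n, 2))
locs_new = rng.uniform(0, 1, size=(m, 2))
centers = rng.uniform(0, 1, size=(r, 2))
S_obs = gaussian_basis(locs_obs, centers, 0.3)
S_new = gaussian_basis(locs_new, centers, 0.3)

B = rng.standard_normal((r, r))
K = B @ B.T + r * np.eye(r)  # symmetric positive definite r x r matrix
eta = rng.multivariate_normal(np.zeros(r), K)
z = S_obs @ eta + np.sqrt(sigma2) * rng.standard_normal(n)

# Low-rank route: cost grows linearly in n for fixed r.
pred_fast = frk_predict(S_obs, S_new, K, sigma2, z)

# Dense reference: forms and solves with the full n x n covariance.
Sigma = S_obs @ K @ S_obs.T + sigma2 * np.eye(n)
pred_dense = S_new @ K @ S_obs.T @ np.linalg.solve(Sigma, z)
```

The two predictors should agree to numerical precision. The dense route needs O(n^3) time and O(n^2) memory, while the low-rank route needs O(n r^2) time, which is why large n makes the dense form impractical.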
