Incomplete Cholesky decomposition for the kriging of large datasets

Abstract. Kriging of very large spatial datasets is a challenging problem. The size n of the dataset makes the kriging estimate expensive to compute: solving the kriging equations directly involves inverting an n × n covariance matrix, which requires O(n^3) arithmetic operations and O(n^2) storage. Under these circumstances, straightforward kriging of massive datasets is not possible. Several approaches have been proposed in the literature, falling into two main families: sparse approximations of the covariance function and low rank approaches. We propose here an approach built upon a low rank approximation of the covariance matrix obtained by incomplete Cholesky decomposition. This algorithm requires O(nk) storage and O(nk^2) arithmetic operations, where k is the rank of the approximation, whose accuracy is controlled by a parameter. We detail the main properties of this method and explore its links with existing methods. Its benefits are illustrated on simple examples and compared to those of existing approaches. Finally, we show that this low rank representation is also suited for inverse conditioning of Gaussian random fields.
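To make the cost figures concrete, the following is a minimal sketch of a pivoted incomplete Cholesky factorization in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gaussian_cov`, `incomplete_cholesky`), the Gaussian covariance model, the greedy largest-residual-diagonal pivoting and the trace-based stopping rule with parameter `tol` are all choices made here for the example. The sketch evaluates only one covariance column per step, so the full n × n matrix is never formed.

```python
import numpy as np


def gaussian_cov(x, y, length_scale=0.3):
    """Gaussian covariance between point sets x (m, d) and y (p, d).

    Hypothetical covariance model chosen for this illustration.
    """
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def incomplete_cholesky(points, cov, tol=1e-6, max_rank=None):
    """Low rank factor L (n, k) such that L @ L.T approximates the covariance.

    Assumed stopping rule: stop once the trace of the residual
    K - L @ L.T falls below `tol`, or after `max_rank` columns.
    """
    n = points.shape[0]
    max_rank = n if max_rank is None else max_rank
    # Diagonal of the residual matrix; initially the point variances.
    diag = np.array([cov(p[None, :], p[None, :])[0, 0] for p in points])
    cols, pivots = [], []
    for _ in range(max_rank):
        if diag.sum() <= tol:
            break
        j = int(np.argmax(diag))  # pivot: largest residual variance
        # One covariance column, O(n) kernel evaluations.
        col = cov(points, points[j:j + 1])[:, 0]
        # Subtract the part already explained by earlier columns, O(nk).
        for c in cols:
            col -= c[j] * c
        new_col = col / np.sqrt(diag[j])
        cols.append(new_col)
        pivots.append(j)
        # Update the residual diagonal; clip rounding noise at zero.
        diag = np.maximum(diag - new_col**2, 0.0)
    L = np.column_stack(cols) if cols else np.zeros((n, 0))
    return L, pivots


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((500, 2))  # synthetic locations in the unit square
    L, pivots = incomplete_cholesky(pts, gaussian_cov, tol=1e-4)
    print("rank k =", L.shape[1], "for n =", pts.shape[0])
```

Each step costs O(nk) operations for the column update, which gives O(nk^2) overall, and only the k stored columns of length n are kept, which gives O(nk) storage. In this sketch the rank k is not fixed in advance: it is determined by `tol`, a smaller tolerance giving a larger k and a more accurate approximation.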
