An anisotropic Matérn spatial covariance model: REML estimation and properties.

This thesis concerns the development, estimation and investigation of a general anisotropic spatial correlation function within model-based geostatistics, expressed as a Gaussian linear mixed model and estimated using residual maximum likelihood (REML). The Matérn correlation function is attractive because it has a parameter that controls the smoothness of the spatial process and that can be estimated from the data. This function is combined with geometric anisotropy, with an extension permitting different distance metrics, to form a flexible spatial covariance model that incorporates as special cases many infinite-range spatial covariance functions in common use. Derivatives of the residual log-likelihood with respect to the four correlation-model parameters are derived, and the REML algorithm is coded in S-PLUS for testing and refinement as a precursor to its implementation in the software ASReml, which adds the generality of linear mixed models. Suggestions are given regarding initial values for the estimation. A residual likelihood ratio test for anisotropy is also developed and investigated. Application to three soil-based examples reveals that anisotropy does occur in practice, and that this technique can fit covariance models that were previously unavailable or inaccessible. Simulations of isotropic and anisotropic data, with and without a nugget effect, reveal the following principal points. Including some closely spaced locations greatly improves estimation, particularly of the Matérn smoothness parameter and, when present, of the nugget variance. The presence of geometric anisotropy does not adversely affect parameter estimation. The presence of a nugget effect introduces greater uncertainty into the parameter estimates, most dramatically for the smoothness parameter; it also increases the chance of non-convergence and decreases the power of the test for anisotropy.
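To make the model description concrete, the following is a minimal sketch of a Matérn correlation combined with geometric anisotropy and a choice of distance metric. It is an illustration under stated assumptions, not the thesis's or ASReml's implementation: the parameter names (`phi` for range, `nu` for smoothness, `delta` for the anisotropy ratio, `alpha` for the anisotropy angle, `lam` for the metric exponent) and the exact way the anisotropy and metric are parameterized are hypothetical choices made here. The correlation uses the common form (2^(1-nu) / Gamma(nu)) (d/phi)^nu K_nu(d/phi), with the Bessel function K_nu evaluated by simple numerical integration so that only the standard library is needed.

```python
import math


def bessel_k(nu, x, n_steps=2000):
    """Modified Bessel function of the second kind, K_nu(x), for x > 0.

    Uses K_nu(x) = integral over t in [0, inf) of exp(-x cosh t) cosh(nu t),
    approximated by the trapezoidal rule on a truncated range.
    """
    t_max = math.acosh(1.0 + 100.0 / x)  # integrand is negligible beyond this
    h = t_max / n_steps
    total = 0.5 * math.exp(-x)  # t = 0 endpoint (half weight), cosh(0) = 1
    for i in range(1, n_steps + 1):
        t = i * h
        weight = 0.5 if i == n_steps else 1.0  # half weight at the far endpoint
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def matern_correlation(d, phi, nu):
    """Matern correlation at distance d, with range phi and smoothness nu."""
    if d == 0.0:
        return 1.0
    x = d / phi
    return (2.0 ** (1.0 - nu) / math.gamma(nu)) * x ** nu * bessel_k(nu, x)


def anisotropic_distance(hx, hy, delta=1.0, alpha=0.0, lam=2.0):
    """Distance for a lag (hx, hy) under an assumed anisotropy parameterization.

    The lag is rotated by angle alpha, the rotated axes are scaled by
    sqrt(delta) and 1/sqrt(delta), and a Minkowski metric with exponent lam
    is applied (lam=2 is Euclidean, lam=1 is city-block).
    """
    u = hx * math.cos(alpha) + hy * math.sin(alpha)
    v = -hx * math.sin(alpha) + hy * math.cos(alpha)
    su = abs(u) * math.sqrt(delta)
    sv = abs(v) / math.sqrt(delta)
    return (su ** lam + sv ** lam) ** (1.0 / lam)


def anisotropic_matern(hx, hy, phi, nu, delta=1.0, alpha=0.0, lam=2.0):
    """Anisotropic Matern correlation between two points separated by (hx, hy)."""
    return matern_correlation(anisotropic_distance(hx, hy, delta, alpha, lam), phi, nu)
```

With `delta=1` and `lam=2` the sketch reduces to the isotropic Matérn correlation, and with `nu=0.5` it reduces to the exponential correlation exp(-d/phi), one of the infinite-range special cases.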
Estimation is more difficult with very “unsmooth” processes (Matérn smoothness parameter 0.1 or 0.25): non-convergence is more likely, and estimates are less precise and/or more biased. However, it is still often possible to fit the full model, including both anisotropy and a nugget effect, using REML with as few as 100 observations. Additional simulations involving model misspecification reveal that ignoring anisotropy when it is present can substantially increase the mean squared error of prediction, whereas overfitting by attempting to model anisotropy when it is absent is less damaging. Further, plug-in estimates of prediction error variance are reasonable estimates of the actual mean squared error of prediction, regardless of the model fitted. This weakens the argument that Bayesian approaches are required to properly allow for uncertainty in the parameter estimates when estimating prediction error variance. The most valuable outcome of this research is the implementation of an anisotropic Matérn correlation function in ASReml, including the full generality of Gaussian linear
