A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations

In geographically weighted regression (GWR), one must determine a window size that is used to subset the data locally. Typically, a cross-validation procedure is used to determine a single, globally optimal window size. Preliminary investigations indicate that the global cross-validation score is heavily influenced by a small number of observations in the dataset. At present, the ramifications of this behaviour for cross-validation are unknown. The research reported here explores the extent to which individual observations and groups of observations affect the determination of the optimal window size, and whether one can explain why some points are more influential than others. In addition, we examine the impact that neighbourhood specification has on model quality, in terms of both predictive capability and the ability of the method to retrieve spatially varying processes. The analysis uses several empirical datasets together with simulated data in order to compare and validate results. The results provide some practical guidelines for the use of cross-validation.
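The cross-validation procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed Gaussian kernel, a leave-one-out score equal to the sum of squared prediction errors, and synthetic data. The names `loo_squared_errors` and `select_bandwidth` are hypothetical, introduced only for illustration. Returning the per-observation errors makes it possible to inspect how much of the global score comes from a few points.

```python
import numpy as np

def loo_squared_errors(coords, X, y, bandwidth):
    """Squared leave-one-out prediction error of each observation.

    Assumes a fixed Gaussian kernel; other kernels or adaptive
    windows would change the weights but not the overall scheme.
    """
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept
    errors = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        w[i] = 0.0  # leave observation i out of its own local fit
        sw = np.sqrt(w)
        # Weighted least squares via rescaling rows by sqrt(weight)
        beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
        errors[i] = (y[i] - Xd[i] @ beta) ** 2
    return errors

def select_bandwidth(coords, X, y, candidates):
    """Return the candidate with the lowest global CV score, plus all scores."""
    scores = [loo_squared_errors(coords, X, y, b).sum() for b in candidates]
    return candidates[int(np.argmin(scores))], scores

# Synthetic example (assumed data): the slope varies smoothly with the x-coordinate.
rng = np.random.default_rng(0)
n = 60
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x + rng.normal(scale=0.3, size=n)

candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
best, scores = select_bandwidth(coords, x, y, candidates)

# Share of the global CV score contributed by each observation at the chosen bandwidth
errors = loo_squared_errors(coords, x, y, best)
share = errors / errors.sum()
top5 = np.sort(share)[::-1][:5]
print("selected bandwidth:", best)
print("share of CV score from the 5 largest contributors:", top5.sum())
```

If a handful of observations account for a large share of the score, removing them and re-running `select_bandwidth` is one simple way to check how sensitive the selected window size is to those points.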
