Methods to account for spatial autocorrelation in the analysis of species distributional data: a review

Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. Violating this assumption may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species’ distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; conditional autoregressive models; simultaneous autoregressive models; and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method’s implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing
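As an illustration of what "spatial autocorrelation in model residuals" means in practice, the sketch below computes Moran's I, a commonly used diagnostic that is not named in the text above, for residuals on a small grid. The function name `morans_i`, the binary distance-band weights, the 10 x 10 grid and the two synthetic residual patterns are all assumptions made for this example; it is not one of the six methods listed above, only a way of checking whether residuals look spatially independent.

```python
import math
import random


def morans_i(values, coords, max_dist):
    """Moran's I with binary distance-band weights.

    Assumption for this sketch: w_ij = 1 if 0 < distance(i, j) <= max_dist,
    else 0. Values near 0 suggest no spatial autocorrelation; values towards
    +1 suggest that nearby locations have similar values.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross = 0.0   # sum of w_ij * dev_i * dev_j over ordered pairs
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if math.dist(coords[i], coords[j]) <= max_dist:
                cross += dev[i] * dev[j]
                w_sum += 1.0

    total_ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / total_ss)


# Hypothetical 10 x 10 grid of cell centres.
coords = [(x, y) for x in range(10) for y in range(10)]

# Synthetic "residuals" with a smooth spatial trend (strongly autocorrelated).
smooth = [float(x + y) for x, y in coords]

# The same values randomly reassigned to cells (spatial structure destroyed).
rng = random.Random(1)
shuffled = smooth[:]
rng.shuffle(shuffled)

# max_dist = 1.1 links each cell to its four rook neighbours only.
i_smooth = morans_i(smooth, coords, max_dist=1.1)
i_shuffled = morans_i(shuffled, coords, max_dist=1.1)

print(f"Moran's I, smooth residuals:   {i_smooth:.3f}")
print(f"Moran's I, shuffled residuals: {i_shuffled:.3f}")
```

A clearly positive value for the smooth pattern and a value close to zero for the shuffled pattern correspond to the two situations contrasted above: residuals that violate the i.i.d. assumption and residuals that are consistent with it. The neighbourhood distance is a modelling choice, and a different weighting scheme would give different numerical values.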
