spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models

Spatial thinning of species occurrence records can help address problems associated with spatial sampling bias. Ideally, thinning removes the fewest records necessary to substantially reduce the effects of sampling bias while retaining as much useful information as possible. Spatial thinning can be done manually; however, this is prohibitively time-consuming for large datasets. Using a randomization approach, the ‘thin’ function in the spThin R package returns a dataset with the maximum number of records for a given thinning distance, provided it is run for enough iterations. Here we provide a worked example for the Caribbean spiny pocket mouse, in which the results match those obtained by manual thinning.
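To make the randomization idea concrete, the following is a minimal Python sketch of one possible randomized thinning scheme. It is not the spThin implementation (which is written in R), and the function names `haversine_km`, `thin_once`, and `thin_records`, the removal rule, and the sample coordinates are all assumptions made for illustration. Each repetition removes, at random, one of the records with the most neighbours closer than the thinning distance, until no two retained records are closer than that distance; the largest result across repetitions is kept.

```python
import math
import random


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def thin_once(points, thin_km, rng):
    """One randomized thinning pass (hypothetical rule, for illustration)."""
    kept = list(range(len(points)))
    while True:
        # Count, for each retained record, its neighbours closer than thin_km.
        counts = {
            i: sum(1 for j in kept
                   if j != i and haversine_km(points[i], points[j]) < thin_km)
            for i in kept
        }
        worst = max(counts.values(), default=0)
        if worst == 0:
            # No retained record has a neighbour within the thinning distance.
            return [points[i] for i in kept]
        # Remove, at random, one of the most crowded records.
        candidates = [i for i in kept if counts[i] == worst]
        kept.remove(rng.choice(candidates))


def thin_records(points, thin_km, reps, seed=0):
    """Repeat the randomized pass reps (>= 1) times; keep the largest result."""
    rng = random.Random(seed)
    runs = [thin_once(points, thin_km, rng) for _ in range(reps)]
    return max(runs, key=len)


# Made-up coordinates: three records about 5.6 km apart along the equator,
# plus one distant record.
records = [(0.0, 0.0), (0.0, 0.05), (0.0, 0.10), (10.0, 10.0)]
thinned = thin_records(records, thin_km=10.0, reps=20)
```

Because each pass involves random choices, different repetitions can retain different numbers of records, which is why running more repetitions makes it more likely that the largest possible thinned dataset is found.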
