Does the interpolation accuracy of species distribution models come at the expense of transferability?

Model transferability (extrapolative accuracy) is an important feature of species distribution models, required in several ecological and conservation biology applications. This study uses 10 modelling techniques and nationwide data on both (1) species distributions of birds, butterflies, and plants and (2) climate and land cover in Finland to investigate whether good interpolative prediction accuracy comes at the expense of transferability, i.e. markedly worse performance in new areas. The models' interpolation and extrapolation performance was assessed primarily using AUC (the area under the curve of a receiver operating characteristic plot) and Kappa statistics, with supplementary comparisons of model sensitivity and specificity values. Our AUC and Kappa results show that extrapolation to new areas is a greater challenge for all included modelling techniques than simple filling of gaps in a well-sampled area, but the techniques also differ in their degree of transferability. Among the machine-learning techniques, MAXENT, generalized boosting methods (GBM), and artificial neural networks (ANN) showed good transferability, while the performance of GARP and random forest (RF) decreased notably in extrapolation. Among the regression-based methods, generalized additive models (GAM) and generalized linear models (GLM) showed good transferability. A desirable combination of good prediction accuracy and good transferability was evident for three modelling techniques: MAXENT, GBM, and GAM. However, examination of model sensitivity and specificity revealed that model types may differ in their tendency towards increased over-prediction of either presences or absences in extrapolation, and some of the methods show contrasting changes in sensitivity vs specificity (e.g. ANN and GARP).
Among the three species groups, the best transferability was seen with birds, followed closely by butterflies, whereas reliable extrapolation of plant species distribution models appears to be a major challenge, at least at this scale. Overall, detailed knowledge of the behaviour of different techniques in various study settings and with different species groups is of utmost importance in predictive modelling.
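
As an illustration of the four evaluation measures named above, the following minimal Python sketch computes AUC, Kappa, sensitivity, and specificity for one set of presence/absence predictions. It is not the study's code: the function names, the toy data, and the fixed 0.5 threshold used to convert probabilities into presences and absences are all assumptions made for this example.

```python
# Illustrative sketch only: function names, data, and the 0.5 threshold
# are hypothetical and are not taken from the study.

def confusion(observed, predicted_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted_prob):
        pred = 1 if prob >= threshold else 0
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fp, tn, fn):
    """Proportion of observed presences predicted as presences."""
    return tp / (tp + fn)


def specificity(tp, fp, tn, fn):
    """Proportion of observed absences predicted as absences."""
    return tn / (tn + fp)


def kappa(tp, fp, tn, fn):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


def auc(observed, predicted_prob):
    """AUC as the probability that a random presence outranks a random absence."""
    pos = [p for o, p in zip(observed, predicted_prob) if o == 1]
    neg = [p for o, p in zip(observed, predicted_prob) if o == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy evaluation data (made up): 1 = presence, 0 = absence.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

cells = confusion(observed, predicted_prob)
results = {
    "auc": auc(observed, predicted_prob),
    "kappa": kappa(*cells),
    "sensitivity": sensitivity(*cells),
    "specificity": specificity(*cells),
}
```

Running the same functions once on held-out cells from the training region and once on cells from a new region would give an interpolation score and an extrapolation score for each measure. A drop in sensitivity with stable specificity would suggest increased over-prediction of absences in the new area, and the reverse would suggest over-prediction of presences.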
