Akaike information criterion should not be a "test" of geographical prediction accuracy in ecological niche modelling

Abstract. Model complexity has recently been recognized as an important issue in ecological niche modelling because it may affect model performance. Recent methodological developments use the Akaike information criterion (AIC) to account for model complexity in the Maxent algorithm. AIC is calculated from the number of model parameters and the likelihood derived from Maxent's continuous raw output. The ENMeval R package allows users to perform species-specific tuning of Maxent settings by running models with different combinations of regularization multiplier and feature classes; the resulting models are then compared using AIC corrected for small sample size (AICc). This approach aims to find the “best” model parametrization, and it is thought to optimize model complexity and, therefore, predictive ability. We found that most of the niche modelling studies we examined (68%) tend to treat AIC as a criterion of predictive accuracy in geographical distribution. In other words, AIC is used to choose the models with the highest capacity to discriminate between presences and absences. However, the link between AIC and geographical predictive accuracy has not been tested so far. Here, we evaluated this relationship using simulated (virtual) species. We created nine virtual species with different ecological and geographical traits (e.g., niche position, niche breadth, range size) and generated different sets of true presence and absence data across geography. We built models with the Maxent algorithm using different regularization values and feature schemes, and calculated AIC values for each model. For each model, we obtained binary predictions using different threshold criteria and validated them with independent presence and absence data. We then correlated AIC values against standard validation metrics (e.g., Kappa, TSS) and against the number of pixels correctly predicted as presences and absences.
We found no correlation between AIC values and predictive accuracy as measured by the validation metrics. In general, the models with the lowest AIC values tended to generate geographical predictions with high commission and omission errors. The results were consistent across all simulated species. We therefore suggest that AIC should not be used for model selection in ecological niche modelling when users are more interested in prediction than in explanation.
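
The two kinds of quantity compared in the abstract can be illustrated with a minimal sketch. It is not the authors' code and not the ENMeval implementation. It assumes that raw Maxent scores are rescaled to sum to 1 over all cells, that the log-likelihood is the sum of the log rescaled scores at presence cells, and that the parameter count is supplied by the caller. All function names (`aicc`, `confusion`, `tss`, `kappa`) are hypothetical.

```python
import math


def aicc(raw_values, presence_indices, n_params):
    """AICc from continuous raw scores (illustrative sketch, assumptions above)."""
    total = sum(raw_values)
    # Assumption: rescale raw scores so they sum to 1 across all cells.
    probs = [v / total for v in raw_values]
    # Log-likelihood: sum of log rescaled scores at the presence cells.
    log_lik = sum(math.log(probs[i]) for i in presence_indices)
    n = len(presence_indices)
    k = n_params
    if n - k - 1 <= 0:
        # The small-sample correction is undefined in this case.
        return float("nan")
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def confusion(pred, truth):
    """Counts (tp, fp, fn, tn) for binary predictions against true 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def tss(tp, fp, fn, tn):
    """True skill statistic: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


# Toy data: six cells, binary prediction after thresholding, and known truth.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
counts = confusion(pred, truth)
print("TSS:", tss(*counts), "Kappa:", kappa(*counts))
```

AICc is computed from the continuous output at presence cells only, whereas TSS and Kappa depend on thresholded predictions checked against both presences and absences. The two therefore measure different things, which is why their agreement has to be tested rather than assumed.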
