openModeller: a generic approach to species’ potential distribution modelling

Species’ potential distribution modelling is the process of building a representation of the fundamental ecological requirements of a species and extrapolating these requirements into a geographical region. The importance of being able to predict the distribution of species is currently highlighted by issues such as global climate change, public health problems caused by disease vectors, and anthropogenic impacts that can lead to massive species extinction. There are several computational approaches that can be used to generate potential distribution models, each achieving optimal results under different conditions. However, the existing software packages available for this purpose typically implement a single algorithm, and each package presents a new learning curve to the user. Whenever new software is developed for species’ potential distribution modelling, significant duplication of effort results, because many feature requirements are shared between the different packages. Additionally, data preparation and comparison between algorithms become difficult when using separate software applications, since each application has different data input and output capabilities. This paper describes a generic approach for building a single computing framework capable of handling different data formats and multiple algorithms that can be used in potential distribution modelling. The ideas described in this paper have been implemented in a free and open source software package called openModeller. The main concepts of species’ potential distribution modelling are also explained, and an example use case illustrates potential distribution maps generated by the framework.
