Species distribution modeling in the cloud

Species distribution modeling is the process of computationally predicting the distribution of species in geographic areas on the basis of environmental parameters, including climate data. This quantitative approach has considerable potential in many areas, including setting conservation priorities, testing biogeographic hypotheses, and assessing the impact of accelerated land use. To promote wider adoption of the approach, it is essential to develop a flexible, comprehensive, and robust environment that enables practitioners and communities of practice to produce species distribution models more efficiently. A promising way to build such an environment is offered by modern infrastructures that promote the sharing of resources, including hardware, software, data, and services. This paper describes an approach to species distribution modeling based on a Hybrid Data Infrastructure that offers a rich array of data and data management services by leveraging other infrastructures (including cloud infrastructures). It discusses the full set of services needed to support the phases of this complex process, including access to occurrence records and environmental parameters and the processing of that information to predict the probability of a species’ occurrence in given areas. Copyright © 2013 John Wiley & Sons, Ltd.
