A Computational Pipeline for the IUCN Risk Assessment for Meso-American Reef Ecosystem

Coral reefs are of global economic and biological significance but face increasing threats. It is therefore essential to understand the risk of coral reef ecosystem collapse and to develop assessment processes for these ecosystems. The International Union for Conservation of Nature (IUCN) Red List of Ecosystems (RLE) is a framework for assessing the vulnerability of an ecosystem. Importantly, assessment processes need to be repeatable as new monitoring data become available; repeatability also enhances transparency. In this paper, we discuss the evolution of a computational pipeline for risk assessment of the Meso-American reef ecosystem, a diverse reef ecosystem located in the Caribbean. We focus on improving execution time, moving from a sequential implementation to a parallel one and finally to Apache Spark. In its final form, the pipeline is a scientific workflow, which improves its repeatability and reproducibility.
