Best practice for biodiversity data management and publication

Abstract

There is increasing pressure from the scientific community, including funding agencies, journals and peers, for authors to publish the biodiversity data underlying their articles and other scientific literature. Publishing these data enables research to be reproduced and creates new opportunities to integrate data across research projects and to analyse them in additional ways. The long-term availability of data is especially important in conservation science because field data can be costly to collect. In addition, historical data, especially on threatened species and their associated biota, become more valuable over time. This paper summarises current standards and best practices for the management and publication of biodiversity data. It includes recommendations for citing sources of species determination and standards for formatting species distribution data. Whenever possible, data should be published for inclusion in data access platforms that integrate datasets (e.g. GBIF, GenBank) and so enable new analyses and broader impact. Data centres (e.g. PANGAEA) add value by performing quality checks on data. As a minimum standard, data should be permanently archived in an online, open-access repository with sufficient metadata for potential users to understand how and why they were collected.
