The BIEN R package: A tool to access the Botanical Information and Ecology Network (BIEN) database

There is an urgent need for large‐scale botanical data to improve our understanding of community assembly, coexistence, biogeography, evolution, and many other fundamental biological processes. Understanding these processes is critical for predicting and managing human‐biodiversity interactions and global change dynamics such as food and energy security, ecosystem services, climate change, and species invasions. The Botanical Information and Ecology Network (BIEN) database comprises an unprecedented wealth of cleaned and standardised botanical data: c. 81 million occurrence records from c. 375,000 species, c. 915,000 trait observations across 28 traits from c. 93,000 species, and co‐occurrence records from 110,000 ecological plots globally, as well as 100,000 range maps and 100 replicated phylogenies (each containing 81,274 species) for New World species. Here, we describe an R package that provides easy access to these data. The BIEN R package gives users access to each of the data types in the BIEN database. Functions in the package query the database by turning user inputs into optimised PostgreSQL functions, and function names follow a convention designed to make it easy to understand what each function does. We have also developed a protocol for providing customised citations and herbarium acknowledgements for data downloaded through the BIEN R package. The development of the BIEN database represents a significant achievement in biological data integration, cleaning, and standardisation. Likewise, the BIEN R package is an important tool for open science that makes the BIEN database freely and easily accessible to everyone.
