Multidimensional biases, gaps and uncertainties in global plant occurrence information

Plants are a hyperdiverse clade that plays a key role in maintaining ecological and evolutionary processes as well as human livelihoods. Biases, gaps and uncertainties in plant occurrence information are a central problem in ecology and conservation, yet they remain largely unassessed at the global scale. In this synthesis, we propose a conceptual framework for analysing gaps in information coverage, information uncertainties and biases in these metrics along taxonomic, geographical and temporal dimensions, and apply it to all c. 370 000 species of land plants. To this end, we integrated 120 million point-occurrence records with independent databases on plant taxonomy, distributions and conservation status. We find that different data limitations are prevalent in each dimension. Different metrics of information coverage and uncertainty are largely uncorrelated, and reducing taxonomic, spatial or temporal uncertainty by filtering out records would usually come at a great cost to coverage. In light of these multidimensional data limitations, we discuss prospects for global plant ecological and biogeographical research, monitoring and conservation, and outline critical next steps towards more effective information usage and mobilisation. Our study provides an empirical baseline for evaluating and improving global floristic knowledge, along with a conceptual framework that can be applied to study other hyperdiverse clades.
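The trade-off between uncertainty and coverage can be illustrated with a minimal sketch. It is not the study's method or data: the records, the checklist, the `coord_uncertainty_km` field and the 5 km threshold are all hypothetical, chosen only to show how filtering out spatially uncertain records lowers the share of checklist species that still have at least one record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    species: str
    coord_uncertainty_km: float  # hypothetical per-record spatial uncertainty


def species_coverage(records, checklist):
    """Fraction of checklist species with at least one record."""
    found = {r.species for r in records} & set(checklist)
    return len(found) / len(checklist)


def filter_by_uncertainty(records, max_km):
    """Keep only records whose spatial uncertainty is at most max_km."""
    return [r for r in records if r.coord_uncertainty_km <= max_km]


# Invented toy data (assumption): five records for four species.
records = [
    Record("Abies alba", 0.5),
    Record("Abies alba", 12.0),
    Record("Banksia hookeriana", 25.0),
    Record("Pinus albicaulis", 2.0),
    Record("Quercus robur", 40.0),
]
# Invented checklist (assumption): one species has no records at all.
checklist = [
    "Abies alba",
    "Banksia hookeriana",
    "Pinus albicaulis",
    "Quercus robur",
    "Zamia pumila",
]

coverage_all = species_coverage(records, checklist)
filtered = filter_by_uncertainty(records, max_km=5.0)
coverage_filtered = species_coverage(filtered, checklist)

print(f"coverage, all records:      {coverage_all:.2f}")
print(f"coverage, uncertainty<=5km: {coverage_filtered:.2f}")
```

In this toy case the filter keeps two of five records and halves species coverage. The same pattern of computing a coverage metric before and after a filter could in principle be applied to taxonomic or temporal uncertainty as well.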
