Digitization and the Future of Natural History Collections

Natural history collections (NHCs) are the foundation of historical baselines for assessing anthropogenic impacts on biodiversity. The online mobilization of specimens via digitization (the conversion of specimen data into accessible digital content) has greatly expanded the use of NHCs across a diversity of disciplines. We broaden the current vision of digitization (Digitization 1.0), whereby specimens are digitized within NHCs, to include new approaches that rely on digitized products rather than the physical specimen (Digitization 2.0). Digitization 2.0 builds upon the data, workflows, and infrastructure produced by Digitization 1.0 to create digital-only workflows that facilitate digitization, curation, and data linkages. These workflows return value to physical specimens by creating new layers of annotation, empowering a global community, and developing automated approaches to advance biodiversity discovery and conservation. These efforts will transform large-scale biodiversity assessments to address fundamental questions, including those pertaining to critical modern issues of global change.
