Enhancing DNA barcode reference libraries by harvesting terrestrial arthropods at the Smithsonian's National Museum of Natural History

DNA barcoding has revolutionised biodiversity science, but its application depends on comprehensive and reliable reference libraries. For many poorly known taxa, reference sequences are missing even at higher taxonomic levels. We harvested the collections of the Smithsonian’s National Museum of Natural History (USNM) to generate DNA barcode sequences for genera of terrestrial arthropods not previously recorded in one or more major public sequence databases. Our workflow combined Sanger and Next-Generation Sequencing (NGS) approaches to maximise sequence recovery while keeping costs affordable. In total, COI sequences were obtained for 5,686 specimens belonging to 3,737 determined species in 3,886 genera and 205 families distributed in 137 countries. Success rates varied widely according to collection data and focal taxon. NGS helped recover sequences from specimens that had failed a previous round of Sanger sequencing. Success rates and the optimal balance between Sanger and NGS are the most important factors for maximising output and minimising cost in future projects. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, the Global Genome Biodiversity Network Data Portal and the NMNH data portal.
