DNA barcoding and taxonomy: dark taxa and dark texts

Both classical taxonomy and DNA barcoding are engaged in the task of digitizing the living world. Much of the taxonomic literature remains undigitized. The rise of open access publishing this century and the freeing of older literature from the shackles of copyright have greatly increased the online availability of taxonomic descriptions, but much of the literature of the mid- to late-twentieth century remains offline (‘dark texts’). DNA barcoding is generating a wealth of computable data that in many ways are much easier to work with than classical taxonomic descriptions, but many of the sequences are not identified to species level. These ‘dark taxa’ hamper the classical method of integrating biodiversity data, using shared taxonomic names. Voucher specimens are a potential common currency of both the taxonomic literature and sequence databases, and could be used to help link names, literature and sequences. An obstacle to this approach is the lack of stable, resolvable specimen identifiers. The paper concludes with an appeal for a global ‘digital dashboard’ to assess the extent to which biodiversity data are available online. This article is part of the themed issue ‘From DNA barcodes to biomes’.
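The idea of voucher specimens as a common currency can be illustrated with a minimal sketch. It assumes that both sequence records and literature mentions carry a free-text specimen code, and that codes differ only in case and separators. The function names, the record shapes and the example codes are all hypothetical and are not taken from any real database.

```python
import re
from collections import defaultdict


def normalize_specimen_code(code: str) -> str:
    """Collapse case and separator differences in a specimen code.

    Assumption: a code is a run of letter and digit groups, e.g.
    'ABC:Herp:12345' or 'abc herp 12345'.
    """
    parts = re.findall(r"[A-Za-z]+|\d+", code)
    return ":".join(part.upper() for part in parts)


def link_by_voucher(sequences, literature):
    """Group sequence and publication identifiers by shared voucher code.

    sequences:  iterable of (accession, voucher_code) pairs
    literature: iterable of (publication_id, voucher_code) pairs
    Returns only vouchers that appear on both sides.
    """
    links = defaultdict(lambda: {"sequences": [], "publications": []})
    for accession, voucher in sequences:
        links[normalize_specimen_code(voucher)]["sequences"].append(accession)
    for pub_id, voucher in literature:
        links[normalize_specimen_code(voucher)]["publications"].append(pub_id)
    return {
        code: hits
        for code, hits in links.items()
        if hits["sequences"] and hits["publications"]
    }


# Hypothetical records, for illustration only.
sequences = [("SEQ0001", "ABC:Herp:12345"), ("SEQ0002", "XYZ 999")]
literature = [("pub-A", "abc herp 12345"), ("pub-B", "QRS-42")]

linked = link_by_voucher(sequences, literature)
```

String normalization of this kind is fragile: a paper that cites the same specimen as "ABC 12345", omitting the collection code, would not be linked. That fragility is the practical cost of lacking stable, resolvable specimen identifiers.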
