Revisiting use of DNA characters in taxonomy with MolD - a tree independent algorithm to retrieve diagnostic nucleotide characters from monolocus datasets

While DNA characters are increasingly used for phylogenetic inference, taxon delimitation and identification, their use in the formal description of taxa (i.e. in providing either a formal description or a diagnosis) remains scarce and inconsistent. The impediments are neither nomenclatural nor conceptual but methodological: there is no agreement on which DNA characters should be provided, and no suitable operational algorithm to identify such characters. Furthermore, reluctance to use DNA data in taxonomy may also stem from concerns about the reliability of DNA characters, as the robustness of DNA-based diagnoses has never been thoroughly assessed. Removing these impediments will enhance the integrity of systematics and enable efficient treatment of traditionally problematic cases such as cryptic species. We have developed MolD, a novel, versatile and scalable algorithm that recovers diagnostic combinations of nucleotides (DNCs) for pre-defined groups of DNA sequences corresponding to taxa. We applied MolD to four published monolocus datasets to examine 1) which type of DNA character compilation allows for a more robust diagnosis, and 2) how the robustness of a DNA-based diagnosis changes with the sampled fraction of a taxon's diversity. We demonstrate that redundant DNCs, termed herein sDNCs, allow for higher robustness. Furthermore, we show that a reliable DNA-based diagnosis may be obtained even when only a rather small fraction of the entire dataset is available. Based on our results, we propose improvements to existing practices of handling DNA data in taxonomic descriptions and discuss a workflow for a contemporary systematic study in which the integrative taxonomy part precedes the proposal of a DNA-based diagnosis, and the diagnosis itself can be used efficiently as a DNA barcode. Our analysis fills existing methodological gaps, setting the stage for wider use of DNA data in the description of taxa.
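To make the notion of a diagnostic combination of nucleotides concrete, the following is a minimal Python sketch and not the MolD implementation: the abstract does not specify MolD's search strategy, so a plain exhaustive search is swapped in here. It assumes the sequences are already aligned to equal length and assigned to taxa; the function name `find_dncs`, its parameters and the toy data are hypothetical.

```python
from itertools import combinations


def find_dncs(sequences, labels, focal, max_size=3):
    """Return the shortest diagnostic site combinations for taxon `focal`.

    Illustrative brute-force search, NOT the MolD algorithm. A set of
    sites is treated as diagnostic when every focal sequence carries the
    same nucleotide at each site and no non-focal sequence matches the
    whole pattern. Sites are reported 1-based.
    """
    focal_seqs = [s for s, lab in zip(sequences, labels) if lab == focal]
    other_seqs = [s for s, lab in zip(sequences, labels) if lab != focal]
    if not focal_seqs:
        raise ValueError(f"no sequences labelled {focal!r}")

    ref = focal_seqs[0]
    # Only sites invariant within the focal taxon can be part of a pattern.
    fixed = [i for i in range(len(ref))
             if len({s[i] for s in focal_seqs}) == 1]

    # Try single sites first, then pairs, triples, ... up to max_size.
    for size in range(1, max_size + 1):
        hits = []
        for sites in combinations(fixed, size):
            shared = any(all(o[i] == ref[i] for i in sites)
                         for o in other_seqs)
            if not shared:
                hits.append({i + 1: ref[i] for i in sites})
        if hits:
            return hits
    return []


# Hypothetical toy alignment: taxon A has no single diagnostic site,
# but the combination of sites 1 and 3 separates it from B and C.
sequences = ["ACGT", "ACGT", "ACCT", "TCGT"]
labels = ["A", "A", "B", "C"]
print(find_dncs(sequences, labels, "A"))  # [{1: 'A', 3: 'G'}]
print(find_dncs(sequences, labels, "B"))  # [{3: 'C'}]
```

In this toy case no single site separates taxon A from the others, yet sites 1 and 3 taken together do, which is the situation a combination-based diagnosis is meant to handle. The exhaustive search grows combinatorially with alignment length and is only practical for small examples.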
