ABGD, Automatic Barcode Gap Discovery for primary species delimitation

Within uncharacterized groups, DNA barcodes, short DNA sequences that are present in a wide range of species, can be used to assign organisms to species. We propose an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap, which can be observed whenever the divergence among organisms belonging to the same species is smaller than the divergence among organisms from different species. We use a range of prior intraspecific divergences to infer from the data a model-based one-sided confidence limit for intraspecific divergence. The method, called Automatic Barcode Gap Discovery (ABGD), then detects the barcode gap as the first significant gap beyond this limit and uses it to partition the data. Inference of the limit and gap detection are then applied recursively to the previously obtained groups to get finer partitions, until no further partitioning occurs. Using six published data sets of metazoans, we show that ABGD is computationally efficient and performs well for standard prior maximum intraspecific divergences (a few per cent of divergence for five of the data sets), except for one data set in which fewer than three sequences per species were sampled. We further explore the theoretical limitations of ABGD through simulation of explicit speciation and population genetics scenarios. Our results emphasize in particular the sensitivity of the method to the presence of recent speciation events, arising from (unrealistically) high rates of speciation or large numbers of species. In conclusion, ABGD is a fast, simple method to split a sequence alignment data set into candidate species, which should be complemented with other evidence in an integrative taxonomic approach.
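The gap-then-partition-then-recurse loop described above can be illustrated with a minimal Python sketch. It is not the ABGD implementation: the model-based confidence limit is replaced by a user-supplied `prior_limit`, "significant gap" is approximated by a hypothetical `min_gap_width` parameter, and groups are formed by simple single-linkage clustering. All function names and the toy distances are assumptions made for illustration only.

```python
from itertools import combinations


def find_gap_threshold(dists, prior_limit, min_gap_width):
    """Return a cut-off inside the first wide gap whose upper edge lies
    beyond prior_limit, or None if no such gap exists.
    Assumption: gap significance is approximated by a fixed width."""
    values = sorted(dists)
    for lo, hi in zip(values, values[1:]):
        if hi > prior_limit and hi - lo >= min_gap_width:
            return (lo + hi) / 2
    return None


def partition(names, dist, threshold):
    """Single-linkage grouping: link two sequences when their distance
    is below threshold (union-find over all pairs)."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] < threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


def delimit(names, dist, prior_limit, min_gap_width):
    """Recursively split names into candidate species until no gap is found."""
    pair_d = [dist[frozenset(p)] for p in combinations(names, 2)]
    threshold = find_gap_threshold(pair_d, prior_limit, min_gap_width)
    if threshold is None:
        return [sorted(names)]
    groups = partition(names, dist, threshold)
    if len(groups) == 1:
        return groups
    result = []
    for g in groups:
        result.extend(delimit(g, dist, prior_limit, min_gap_width))
    return sorted(result)


# Toy pairwise distances (invented): two well-separated clusters.
raw = {
    ("a1", "a2"): 0.01, ("a1", "a3"): 0.02, ("a2", "a3"): 0.015,
    ("b1", "b2"): 0.01,
    ("a1", "b1"): 0.10, ("a1", "b2"): 0.11, ("a2", "b1"): 0.12,
    ("a2", "b2"): 0.10, ("a3", "b1"): 0.11, ("a3", "b2"): 0.12,
}
toy_dist = {frozenset(k): v for k, v in raw.items()}
toy_names = ["a1", "a2", "a3", "b1", "b2"]

candidate_species = delimit(toy_names, toy_dist, prior_limit=0.03, min_gap_width=0.05)
print(candidate_species)  # [['a1', 'a2', 'a3'], ['b1', 'b2']]
```

In this toy input the largest within-group distance is 0.02 and the smallest between-group distance is 0.10, so the first wide gap beyond the 0.03 limit separates the two groups. Neither subgroup contains a further gap, so the recursion stops. With real data the choice of limit and of gap criterion matters considerably, which is why the abstract stresses a range of priors and a model-based limit.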
