The Utility of Genome Skimming for Phylogenomic Analyses as Demonstrated for Glycerid Relationships (Annelida, Glyceridae)

Glyceridae (Annelida) are a group of venomous annelids distributed worldwide from intertidal to abyssal depths. To trace the evolutionary history and complexity of glycerid venom cocktails, a solid backbone phylogeny of this group is essential. We therefore aimed to reconstruct the phylogenetic relationships of these annelids using Illumina sequencing technology. We constructed whole-genome shotgun libraries for 19 glycerid specimens and 1 outgroup species (Glycinde armigera). The chosen target genes comprise 13 mitochondrial protein-coding genes, 2 mitochondrial ribosomal RNA genes, and 4 nuclear loci (18S rRNA, 28S rRNA, ITS1, and ITS2). Based on partitioned maximum likelihood and Bayesian analyses of the resulting supermatrix, we resolved a robust glycerid phylogeny and identified three clades comprising the majority of taxa. Furthermore, we detected group II introns inside the cox1 gene of two of the analyzed glycerid specimens, one of which carries two different insertions. In addition, we generated reduced data sets of 10 million, 4 million, and 1 million reads from the original data sets to test how sequencing depth influences the assembly of complete mitochondrial genomes from low-coverage genome data. We estimated the coverage of mitochondrial genome sequences at each data set size by mapping the filtered Illumina reads against the respective mitochondrial contigs. Comparing contig coverage across all data set sizes gave an indication of how our genome skimming approach scales. This allows a more precise estimate of the minimum number of reads needed to reconstruct complete mitochondrial genomes in Glyceridae and probably in non-model organisms in general.
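The coverage comparison described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the read length, the contig length, and the read counts are all hypothetical, and the extrapolation assumes that mitochondrial coverage scales linearly with the number of sequenced reads, which is a simplification.

```python
import math


def mean_coverage(mapped_read_lengths, contig_length):
    """Mean per-base coverage: total mapped bases divided by contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return sum(mapped_read_lengths) / contig_length


def reads_needed(total_reads, observed_coverage, target_coverage):
    """Total reads required to reach target_coverage.

    Assumes (hypothetically) that coverage grows linearly with read count.
    """
    if observed_coverage <= 0:
        raise ValueError("observed_coverage must be positive")
    return math.ceil(total_reads * target_coverage / observed_coverage)


# Hypothetical example: in a 1 million read data set, 1,500 reads of 100 bp
# map to a 15,000 bp mitochondrial contig.
coverage = mean_coverage([100] * 1500, 15_000)
required = reads_needed(1_000_000, coverage, target_coverage=30)
print(coverage, required)
```

With real data, the mapped read lengths would come from the read mapper's output for each reduced data set, and the coverage values obtained at the different data set sizes would show whether the linear assumption holds.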
