Index-Free De Novo Assembly and Deconvolution of Mixed Mitochondrial Genomes

Second-generation sequencing technology has greatly increased sequencing throughput. To make use of this throughput, we have developed a pipeline for sequencing and de novo assembly of multiple mitochondrial genomes without the cost of indexing. Simulation studies on a mixture of diverse animal mitochondrial genomes showed that the genomes could be reassembled from high coverage of short (35 nt) reads, such as those generated by a second-generation Illumina Genome Analyzer. We then tested this experimentally with long-range polymerase chain reaction products from the mitochondria of a human, a rat, a bird, a frog, an insect, and a mollusc. Comparison with reference genomes was used to deconvolute the assembled contigs rather than to map sequence reads. As proof of concept, we report the complete mitochondrial genome of a mollusc, the olive shell Amalda northlandica. It has a very unusual putative control region, which contains a structure that would probably be detectable only by next-generation sequencing. The general approach has considerable potential, especially when combined with indexed sequencing of different groups of genomes.
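
The deconvolution step described above, in which assembled contigs rather than raw reads are compared with reference genomes, can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the abstract does not say which comparison tool is used, so a simple shared k-mer fraction stands in for a proper alignment. The function names (`assign_contigs`, `kmers`, `revcomp`) and the parameters `k` and `min_shared` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Set


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_contigs(
    contigs: Dict[str, str],
    references: Dict[str, str],
    k: int = 8,
    min_shared: float = 0.1,
) -> Dict[str, Optional[str]]:
    """Assign each contig to the reference sharing the largest k-mer fraction.

    Assumption: k-mer sharing is a toy stand-in for sequence alignment.
    A contig whose best score is below `min_shared` is left unassigned (None).
    """
    # Index both strands of each reference, since contig orientation is unknown.
    ref_index = {
        name: kmers(seq, k) | kmers(revcomp(seq), k)
        for name, seq in references.items()
    }
    assignment: Dict[str, Optional[str]] = {}
    for contig_id, contig in contigs.items():
        contig_kmers = kmers(contig, k)
        best_name, best_score = None, 0.0
        for name, ref_kmers in ref_index.items():
            # Fraction of the contig's k-mers that also occur in this reference.
            score = len(contig_kmers & ref_kmers) / len(contig_kmers) if contig_kmers else 0.0
            if score > best_score:
                best_name, best_score = name, score
        assignment[contig_id] = best_name if best_score >= min_shared else None
    return assignment
```

In use, `references` would hold one related mitochondrial genome per organism in the mixture and `contigs` the assembler output. Contigs left unassigned would need closer inspection, for example regions such as an unusual control region with little similarity to any reference.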
