University of Birmingham From cheek swabs to consensus sequences

Background: Next-generation DNA sequencing (NGS) technologies have had a major impact on many fields of biological research, especially evolutionary biology. One area where NGS has shown potential is high-throughput sequencing of complete mitochondrial DNA (mtDNA) genomes of humans and other animals. Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, significant obstacles remain to the successful implementation of NGS-based projects, especially for new users.

Results: Here we present an ‘A to Z’ protocol for obtaining complete human mtDNA genomes, from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small organellar genomes from other species, as well as nuclear loci. The protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing on the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling).

Conclusions: All steps in this protocol are designed to be straightforward to implement, especially for researchers undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals, and all steps after DNA extraction can be carried out in 96-well plate format. The protocol has also been assembled so that individual ‘modules’ can be swapped out to suit available resources.
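To make the bioinformatics stages named in the abstract (primer removal, reference-based mapping, coverage output and SNP calling) more concrete, here is a minimal Python sketch of what each stage computes. It is not the authors' pipeline, which the abstract does not specify in detail. The function names (`trim_primer`, `pileup`, `coverage`, `call_consensus`), the `min_depth` threshold and the simple majority-vote rule are all assumptions made for illustration. The sketch also assumes reads have already been placed on the reference without gaps; in practice a dedicated aligner and variant caller would do this work.

```python
from collections import Counter


def trim_primer(read, primer):
    """Remove a primer from the 5' end of a read if it is present (exact match only)."""
    return read[len(primer):] if read.startswith(primer) else read


def pileup(reference, alignments):
    """Count the bases observed at each reference position.

    `alignments` is a list of (start, sequence) pairs: ungapped reads already
    placed on the reference at 0-based offset `start` (a simplifying assumption;
    a real mapper also reports insertions, deletions and mapping quality).
    """
    columns = [Counter() for _ in reference]
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1
    return columns


def coverage(columns):
    """Per-position read depth, i.e. the values a coverage plot would show."""
    return [sum(col.values()) for col in columns]


def call_consensus(reference, columns, min_depth=2):
    """Majority-vote consensus plus a list of differences from the reference.

    Positions with depth below `min_depth` (a hypothetical threshold) are
    written as 'N'. SNPs are reported as (1-based position, ref base, alt base).
    """
    consensus = []
    snps = []
    for pos, (ref_base, col) in enumerate(zip(reference, columns)):
        if sum(col.values()) < min_depth:
            consensus.append("N")
            continue
        base, _count = col.most_common(1)[0]
        consensus.append(base)
        if base != ref_base:
            snps.append((pos + 1, ref_base, base))
    return "".join(consensus), snps
```

A typical use would trim each read with `trim_primer`, map it, pass the mapped reads to `pileup`, and then feed the resulting columns to `coverage` for plotting and to `call_consensus` for the final sequence. A plain majority vote ignores base quality and heteroplasmy, so it should be read only as a stand-in for a proper SNP caller.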
