Mitochondrial genome sequence analysis: A custom bioinformatics pipeline substantially improves Affymetrix MitoChip v2.0 call rate and accuracy

Background: Mitochondrial genome sequence analysis is critical to the diagnostic evaluation of mitochondrial disease. Existing methodologies differ widely in throughput, complexity, cost efficiency, and sensitivity of heteroplasmy detection. Affymetrix MitoChip v2.0, which uses a sequencing-by-genotyping technology, allows potentially accurate and high-throughput sequencing of the entire human mitochondrial genome to be completed in a cost-effective fashion. However, the relatively low call rate achieved using existing software tools has limited the wide adoption of this platform for either clinical or research applications. Here, we report the design and development of a custom bioinformatics software pipeline that achieves a much improved call rate and accuracy for the Affymetrix MitoChip v2.0 platform. We used this custom pipeline to analyze MitoChip v2.0 data from 24 DNA samples representing a broad range of tissue types (18 whole blood, 3 skeletal muscle, 3 cell lines), mutations (a 5.8 kilobase pair deletion and 6 known heteroplasmic mutations), and haplogroup origins. All results were compared to those obtained by at least one other mitochondrial DNA sequence analysis method, including Sanger sequencing, denaturing HPLC-based heteroduplex analysis, and/or the Illumina Genome Analyzer II next generation sequencing platform.

Results: An average call rate of 99.75% was achieved across all samples with our custom pipeline. Comparison of calls for 15 samples characterized previously by Sanger sequencing revealed a total of 29 discordant calls, which translates to an estimated base call error rate of 0.012%. We successfully identified 4 known heteroplasmic mutations and 24 other potential heteroplasmic mutations across 20 samples that passed quality control.

Conclusions: Affymetrix MitoChip v2.0 analysis using our optimized MitoChip Filtering Protocol (MFP) bioinformatics pipeline now offers the high sensitivity and accuracy needed for reliable, high-throughput and cost-efficient whole mitochondrial genome sequencing. This approach provides a viable alternative to traditional Sanger and other emerging sequencing technologies for whole mitochondrial genome analysis, with potential utility for both clinical diagnostic and research applications.
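The reported error rate can be sanity-checked with simple arithmetic. The sketch below is not the authors' calculation: it assumes the denominator is the number of compared positions, taken as 15 samples times the 16,569 bp length of the revised Cambridge Reference Sequence, optionally scaled by the average call rate (which the abstract reports for all samples, not specifically for the Sanger subset). The function name and constants are illustrative.

```python
# Back-of-the-envelope check of the reported base call error rate.
# ASSUMPTION: the denominator is samples x genome length (x call rate);
# the abstract does not state how the authors counted compared bases.

RCRS_LENGTH_BP = 16_569    # length of the revised Cambridge Reference Sequence
N_SANGER_SAMPLES = 15      # samples previously characterized by Sanger sequencing
DISCORDANT_CALLS = 29      # discordant calls reported in the abstract
MEAN_CALL_RATE = 0.9975    # average call rate across all samples (assumed to hold for this subset)


def error_rate_percent(discordant, n_samples, genome_length, call_rate=1.0):
    """Return discordant calls as a percentage of compared base positions."""
    compared_positions = n_samples * genome_length * call_rate
    return 100.0 * discordant / compared_positions


# Every position counted as compared.
rate_all = error_rate_percent(DISCORDANT_CALLS, N_SANGER_SAMPLES, RCRS_LENGTH_BP)

# Only called positions counted as compared.
rate_called = error_rate_percent(
    DISCORDANT_CALLS, N_SANGER_SAMPLES, RCRS_LENGTH_BP, MEAN_CALL_RATE
)

print(f"all positions:    {rate_all:.3f}%")
print(f"called positions: {rate_called:.3f}%")
```

Under either assumption the result rounds to 0.012%, in line with the figure given in the Results.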
