High throughput sequencing data analysis workflow: mtDNA variant detection and identification of STR/Y-STR alleles and iso-alleles

Abstract

High throughput sequencing (HTS) of mtDNA and STRs gives forensic laboratories the benefits of both analysis methods at the same time. HTS chemistries are more cost-effective than Sanger sequencing for the mitochondrial genome and produce data at a greater depth of coverage, allowing detection of low-level heteroplasmy [1, 2, 4]. Advantages of HTS STR chemistries over traditional capillary electrophoresis (CE) include smaller amplicons, more loci analyzed in each reaction, and the identification of sequence polymorphisms that, once iso-allele frequencies are available, could potentially be processed in mixture software such as MaSTR™ for deconvolution and likelihood ratio (LR) calculation. Rigorous, user-friendly software is needed to analyze the large data files, consisting of thousands to millions of reads, generated for each sample [5]. GeneMarker®HTS is a rapid, user-friendly software package for analysis of forensic mtDNA, autosomal STR, and Y-STR high throughput sequencing data. The National Institute of Standards and Technology (NIST), in conjunction with Promega Corporation, generously supplied FASTQ sequence files and corresponding CE allele calls for 672 samples amplified with the PowerSeq® Auto/Y System and sequenced on an Illumina® MiSeq. Results of these data analyzed in GeneMarkerHTS software were highly concordant with the CE allele calls. A summary of the STR allele call concordance and examples of alleles exhibiting sequence variation will be presented. Additionally, a review of the mtDNA genome forensic alignment, heteroplasmy report, import of the major variant profile to EMPOP, sample comparison, and database options will be presented.
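
To make the iso-allele idea concrete, the following is a minimal sketch, not the GeneMarkerHTS algorithm. It assumes a hypothetical tetranucleotide locus whose reads have already been trimmed to the repeat region; the function names, the 10% read-fraction threshold, and the example sequences are all illustrative assumptions. It groups reads by sequence, derives a length-based (CE-style) allele number for each, and compares the resulting calls against a CE call for the same sample.

```python
from collections import Counter, defaultdict

REPEAT_UNIT_LENGTH = 4  # assumption: hypothetical tetranucleotide locus


def length_based_allele(sequence, unit=REPEAT_UNIT_LENGTH):
    """Return a CE-style allele name: whole repeats, plus '.x' for leftover bases."""
    whole, partial = divmod(len(sequence), unit)
    return f"{whole}.{partial}" if partial else str(whole)


def call_iso_alleles(reads, min_fraction=0.10):
    """Group reads by sequence and keep sequences above an illustrative threshold.

    Returns {length_based_allele: [(sequence, read_count), ...]}. Two or more
    sequences under one allele number are iso-alleles: same length, different
    sequence.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    calls = defaultdict(list)
    for seq, n in counts.most_common():
        if n / total >= min_fraction:
            calls[length_based_allele(seq)].append((seq, n))
    return dict(calls)


# Hypothetical reads: two 40-base sequences that differ by one base, plus a
# low-level shorter product that falls under the threshold.
reads = (
    ["TCTA" * 10] * 48
    + ["TCTA" * 4 + "TCTG" + "TCTA" * 5] * 45
    + ["TCTA" * 9] * 7
)

calls = call_iso_alleles(reads)
ce_calls = {"10"}  # hypothetical CE result: one peak at allele 10
concordant = set(calls) == ce_calls

for allele, sequences in calls.items():
    for seq, n in sequences:
        print(allele, n, seq)
print("Concordant with CE:", concordant)
```

In this toy sample, CE would report a single allele 10, while the sequence data separates two distinct 40-base sequences under that same allele number. The length-based calls remain concordant with CE even though the sequence-level result carries more information.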