A Method to Evaluate the Quality of Clinical Gene-Panel Sequencing Data for Single-Nucleotide Variant Detection.

Customized gene-panel tests based on next-generation sequencing have proven useful in a wide range of clinical settings. As with other clinical diagnostic techniques, gene-panel sequencing for clinical purposes requires precise quality control (QC) measures to ensure its reliability. Only detected variants are currently recorded in clinical reports; however, determining whether a nondetected variant is a true or a false negative is regarded as essential in a clinical setting, and a comprehensive QC measure is therefore in demand. Conventional QC metrics, such as mean coverage and uniformity, are considered inadequate for such an evaluation, so a more specific measure focused on clinically important variants is proposed here. In this study, we suggest a new scoring method for assessing the quality of clinical gene-panel sequencing data, specifically for the detection of a set of single-nucleotide variants. The performance of the method was analyzed using 2295 clinical samples (1012 formalin-fixed, paraffin-embedded and 1283 fresh-frozen tissues), and the method was shown to provide additional information that conventional metrics, such as mean depth and uniformity, do not. Customized sequencing protocols, including their QC criteria, have been optimized separately by each genomic laboratory. The pass rate scoring method proposed in this study provides an appropriate QC response variable for a customized panel, which strengthens the reliability of calls on the clinically relevant variants included in clinical reports.
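The abstract does not define the pass rate score, so the following is only an illustrative sketch of why a per-site measure can reveal what mean depth hides. It assumes, hypothetically, that a site "passes" when its read depth reaches a fixed threshold; the site names, depths, the `pass_rate` function, and the `min_depth` value are all invented for illustration and are not taken from the study.

```python
from statistics import mean


def pass_rate(site_depths, min_depth):
    """Return the fraction of target sites whose depth is at least min_depth.

    Both the per-site depth criterion and the threshold are assumptions made
    for this sketch; the study's actual scoring rule is not given here.
    """
    if not site_depths:
        raise ValueError("no target sites given")
    passed = sum(1 for depth in site_depths.values() if depth >= min_depth)
    return passed / len(site_depths)


# Hypothetical read depths at five clinically important SNV positions.
depths = {
    "site_A": 900,
    "site_B": 850,
    "site_C": 40,
    "site_D": 1000,
    "site_E": 30,
}

mean_depth = mean(depths.values())
rate = pass_rate(depths, min_depth=100)

print(f"mean depth: {mean_depth}")
print(f"pass rate:  {rate:.2f}")
```

In this made-up sample the mean depth looks comfortable, yet two of the five sites fall below the assumed threshold, so a negative call at those positions could not be trusted. A summary over the whole panel would not expose that.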
