VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

Abstract

Accurate variant calling in next-generation sequencing (NGS) is critical to a better understanding of cancer genomes. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNVs, MNVs, InDels, and complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignment on the fly for more accurate allele frequency estimation. VarDict's performance scales linearly with sequencing depth, enabling the ultra-deep sequencing used to explore tumor evolution or to detect tumor DNA circulating in blood. In addition, VarDict performs amplicon-aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing, which is often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict detects differences in somatic and loss-of-heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate the application of NGS in clinical cancer research.
