Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples

Detection of somatic point substitutions is a key step in characterizing the cancer genome. However, existing methods typically miss low-allelic-fraction mutations that occur in only a subset of the sequenced cells owing to either tumor heterogeneity or contamination by normal cells. Here we present MuTect, a method that applies a Bayesian classifier to detect somatic mutations with very low allele fractions, requiring only a few supporting reads, followed by carefully tuned filters that ensure high specificity. We also describe benchmarking approaches that use real, rather than simulated, sequencing data to evaluate sensitivity and specificity as a function of sequencing depth, base quality and allelic fraction. Compared with other methods, MuTect has higher sensitivity with similar specificity, especially for mutations with allelic fractions of 0.1 and below, making MuTect particularly useful for studying cancer subclones and their evolution in standard exome and genome sequencing data.
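To make the idea of a Bayesian classifier that calls variants from only a few supporting reads more concrete, the following is a minimal sketch of a log-odds score comparing a "variant present at allele fraction f" model against a "reference only, with sequencing errors" model. It is an illustration under stated assumptions, not the MuTect implementation: the function names, the single uniform per-base error rate, and the use of the observed alternate-read fraction as the estimate of f are all assumptions introduced here, and the downstream filters mentioned in the abstract are not modeled.

```python
import math


def read_likelihood(is_alt: bool, error_prob: float, allele_fraction: float) -> float:
    """Probability of observing one base given allele fraction f and error rate e.

    Assumption: an erroneous base is equally likely to be any of the
    three other nucleotides, hence the e / 3 terms.
    """
    f, e = allele_fraction, error_prob
    if is_alt:
        # True alt allele read correctly, or ref allele misread as alt.
        return f * (1.0 - e) + (1.0 - f) * (e / 3.0)
    # True ref allele read correctly, or alt allele misread as ref.
    return f * (e / 3.0) + (1.0 - f) * (1.0 - e)


def log_odds(n_ref: int, n_alt: int, error_prob: float = 0.001) -> float:
    """log10 likelihood ratio of 'variant at fraction f' vs 'no variant' (f = 0).

    Assumption: f is estimated as the observed alt-read fraction and every
    read shares the same (hypothetical) error probability.
    """
    depth = n_ref + n_alt
    if depth == 0:
        return 0.0
    f_hat = n_alt / depth

    def log_lik(frac: float) -> float:
        return (n_alt * math.log10(read_likelihood(True, error_prob, frac))
                + n_ref * math.log10(read_likelihood(False, error_prob, frac)))

    return log_lik(f_hat) - log_lik(0.0)


if __name__ == "__main__":
    # Three alt reads out of 100 give stronger evidence than one out of 100.
    print(log_odds(97, 3), log_odds(99, 1))
```

In a sketch like this, a site would be called a candidate mutation when the score exceeds some threshold; the threshold value and any prior on the mutation rate are left open here because the abstract does not specify them.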
