Nature Genetics, Advance Online Publication. Analysis.

Identification of Significantly Mutated Regions across Cancer Types Highlights a Rich Landscape of Functional Molecular Alterations

Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ~5% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.

In cancer, driver mutations alter functional elements of diverse nature and size. For example, melanoma drivers include hyperactivating mutations mapping to single amino acid residues (for example, BRAF Val600; ref. 1), inactivating mutations along tumor-suppressor exons (for example, in PTEN; ref. 1) and regulatory mutations (for example, in the TERT promoter; ref. 2). Cancer genomics projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) have substantially expanded our understanding of the landscape of somatic alterations by identifying frequently mutated protein-coding genes (refs. 3–5). However, these studies have paid little attention to systematically analyzing the positional distribution of coding mutations or characterizing noncoding alterations (ref. 6).
Algorithms to identify cancer driver genes often examine non-synonymous-to-synonymous mutation rates across the gene body or recurrently mutated amino acids called mutation hotspots (ref. 5), as observed in BRAF (ref. 7), IDH1 (ref. 8) and DNA polymerase ε (encoded by POLE; ref. 9). Yet, these analyses ignore recurrent alterations in the vast intermediate scale of functional coding elements, such as those affecting protein subunits or interfaces. Moreover, where mutation clustering within genes has been examined (refs. 10–12), analyses have employed windows of fixed length or identified clusters of non-synonymous mutations, assuming that driver mutations exclusively influence protein sequence and ignoring the importance of exon-embedded regulatory elements (refs. 13–18). A significant proportion of regulatory elements in the genome are located proximal to or even in exons (refs. 15, 19), suggesting that many may be captured by whole-exome sequencing. Efforts to characterize noncoding regulatory variation …
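To make the contrast with fixed-length windows concrete, the following is a minimal sketch of generic density-based clustering (DBSCAN-style) applied to one-dimensional mutation positions. It is not the authors' pipeline, which the text above does not specify in detail. The function name `dbscan_1d`, the parameters `eps` and `min_samples`, and the example coordinates are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def dbscan_1d(positions, eps, min_samples):
    """Cluster 1D positions with DBSCAN-style rules (illustrative sketch).

    positions   -- mutation coordinates; duplicates represent recurrence
    eps         -- neighbourhood radius in bases (hypothetical parameter)
    min_samples -- minimum points within eps, self included, for a core point
    Returns a list of (start, end, mutation_count) tuples, one per cluster.
    """
    pts = sorted(positions)

    def neighbourhood(i):
        # Half-open index range of points within eps of pts[i].
        return bisect_left(pts, pts[i] - eps), bisect_right(pts, pts[i] + eps)

    labels = [-1] * len(pts)  # -1 marks noise (unclustered mutations)
    cluster = -1
    prev_core = None
    for i in range(len(pts)):
        lo, hi = neighbourhood(i)
        if hi - lo < min_samples:
            continue  # not a core point
        # In 1D, two core points share a cluster only if consecutive
        # core points between them are at most eps apart.
        if prev_core is None or pts[i] - pts[prev_core] > eps:
            cluster += 1
        for j in range(lo, hi):
            if labels[j] == -1:
                labels[j] = cluster  # core or border point joins the cluster
        prev_core = i

    regions = {}
    for pos, lab in zip(pts, labels):
        if lab == -1:
            continue
        start, end, count = regions.get(lab, (pos, pos, 0))
        regions[lab] = (min(start, pos), max(end, pos), count + 1)
    return [regions[k] for k in sorted(regions)]


# Made-up coordinates: two dense groups plus two isolated mutations.
example_positions = [100, 101, 101, 103, 104, 250, 400, 402, 403, 403, 405, 900]
regions = dbscan_1d(example_positions, eps=5, min_samples=4)
print(regions)  # [(100, 104, 5), (400, 405, 5)]
```

The two reported regions have different widths (5 and 6 bases), because the boundaries follow the data rather than a preset window, and the isolated mutations at 250 and 900 are left as noise. Assessing whether such a region is significantly mutated would require a background mutation model, which this sketch does not attempt.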

[1] Y. Toiyama, et al. Clinical significance of SNORA42 as an oncogene and a prognostic biomarker in colorectal cancer, 2015, Gut.

[2] A. Valencia, et al. Non-coding recurrent mutations in chronic lymphocytic leukaemia, 2015, Nature.

[3] M. Snyder, et al. Recurrent Somatic Mutations in Regulatory Regions of Human Cancer Genomes, 2015, Nature Genetics.

[4] Michael Q. Zhang, et al. Integrative analysis of 111 reference human epigenomes, 2015, Nature.

[5] Alex P. Reynolds, et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer, 2015, Nature.

[6] G. von Heijne, et al. Tissue-based map of the human proteome, 2015, Science.

[7] B. Frey, et al. The human splicing code reveals new insights into the genetic determinants of disease, 2015, Science.

[8] F. Supek, et al. Differential DNA mismatch repair underlies mutation rate variation across the human genome, 2015, Nature.

[9] Martin A. M. Reijns, et al. Lagging strand replication shapes the mutational landscape of the genome, 2015, Nature.

[10] Jean J. Zhao, et al. PI3K in cancer: divergent roles of isoforms, modes of activation and therapeutic targeting, 2014, Nature Reviews Cancer.

[11] Benjamin J. Raphael, et al. Pan-Cancer Network Analysis Identifies Combinations of Rare Somatic Mutations across Pathways and Protein Complexes, 2014, Nature Genetics.

[12] Elhanan Borenstein, et al. Conservation of trans-acting circuitry during mammalian regulatory evolution, 2014, Nature.

[13] E. Larsson, et al. Systematic analysis of noncoding somatic mutations and gene expression alterations across 14 tumor types, 2014, Nature Genetics.

[14] S. Gerstberger, et al. A census of human RNA-binding proteins, 2014, Nature Reviews Genetics.

[15] Adam Godzik, et al. e-Driver: a novel method to identify protein regions driving cancer, 2014, Bioinform.

[16] Zoe Cournia, et al. Investigating the Structure and Dynamics of the PIK3CA Wild-Type and H1047R Oncogenic Mutant, 2014, PLoS Comput. Biol.

[17] Akhilesh Pandey, et al. Activation of diverse signaling pathways by oncogenic PIK3CA mutations, 2014, Nature Communications.

[18] Ariana Peck, et al. Structure of the BRAF-MEK complex reveals a kinase activity independent role for BRAF in MAPK signaling, 2014, Cancer Cell.

[19] C. Sander, et al. Genome-wide analysis of non-coding regulatory mutations in cancer, 2014, Nature Genetics.

[20] Daniel L. Mace, et al. Regulatory analysis of the C. elegans genome with spatiotemporal resolution, 2014, Nature.

[21] Zhongming Zhao, et al. Studying tumorigenesis through network evolution and somatic mutational perturbations in the cancer interactome, 2014, Molecular Biology and Evolution.

[22] Peter J. Bickel, et al. Comparative analysis of regulatory information and circuits across distant species, 2014, Nature.

[23] Konstantinos J. Mavrakis, et al. RNA G-quadruplexes cause eIF4A-dependent oncogene translation in cancer, 2014, Nature.

[24] Benjamin J. Raphael, et al. Expanding the computational toolbox for mining cancer genomes, 2014, Nature Reviews Genetics.

[25] Howard Y. Chang, et al. Quantitative analysis of RNA-protein interactions on a massively parallel array for mapping biophysical and evolutionary landscapes, 2014, Nature Biotechnology.

[26] J. Valcárcel, et al. Synonymous Mutations Frequently Act as Driver Mutations in Human Cancers, 2014, Cell.

[27] Daniel P. Kane, et al. A common cancer-associated DNA polymerase ε mutation causes an exceptionally strong mutator phenotype, indicating fidelity defects distinct from loss of proofreading, 2014, Cancer Research.

[28] S. Gabriel, et al. Discovery and saturation analysis of cancer genes across 21 tumor types, 2014, Nature.

[29] Mona Singh, et al. Interaction-based discovery of functionally important genes in cancers, 2013, Nucleic Acids Research.

[30] Alex P. Reynolds, et al. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution, 2013, Science.

[31] P. Knoepfler, et al. Histone H3.3 mutations: a variant path to cancer, 2013, Cancer Cell.

[32] D. Xie, et al. Overexpression of Rab25 contributes to metastasis of bladder cancer through induction of epithelial-mesenchymal transition and activation of Akt/GSK-3β/Snail signaling, 2013, Carcinogenesis.

[33] Michael E. Harris, et al. Hidden specificity in an apparently non-specific RNA-binding protein, 2013, Nature.

[34] Andrew M. Gross, et al. Network-based stratification of tumor mutations, 2013, Nature Methods.

[35] David Tamborero, et al. OncodriveCLUST: exploiting the positional clustering of somatic mutations to identify cancer genes, 2013, Bioinform.

[36] Jun Li, et al. TCPA: a resource for cancer functional proteomics data, 2013, Nature Methods.

[37] O. Pardo, et al. The p90 RSK family members: common functions and isoform specificity, 2013, Cancer Research.

[38] David T. W. Jones, et al. Signatures of mutational processes in human cancer, 2013, Nature.

[39] Steven A. Roberts, et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers, 2013, Nature Genetics.

[40] Sabine Tejpar, et al. Gene expression patterns unveil a new level of molecular heterogeneity in colorectal cancer, 2013, The Journal of Pathology.

[41] Ryan M. Layer, et al. Breakpoint profiling of 64 cancer genomes reveals numerous complex rearrangements spawned by homology-independent mechanisms, 2013, Genome Research.

[42] Joshua M. Stuart, et al. Integrated genomic characterization of endometrial carcinoma, 2013, Nature.

[43] S. Fields, et al. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function, 2012, Proceedings of the National Academy of Sciences.

[44] Charles E. Vejnar, et al. miRmap: Comprehensive prediction of microRNA target repression strength, 2012, Nucleic Acids Research.

[45] Glenn R. Masson, et al. Oncogenic mutations mimic and enhance dynamic events in the natural activation of phosphoinositide 3-kinase p110α (PIK3CA), 2012, Proceedings of the National Academy of Sciences.

[46] Matthew B. Callaway, et al. MuSiC: Identifying mutational significance in cancer genomes, 2012, Genome Research.

[47] A. Sivachenko, et al. A Landscape of Driver Mutations in Melanoma, 2012, Cell.

[48] Raymond K. Auerbach, et al. An Integrated Encyclopedia of DNA Elements in the Human Genome, 2012, Nature.

[49] A. Sivachenko, et al. Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer, 2012, Nature Genetics.

[50] A. Ashworth, et al. The DNA damage response and cancer therapy, 2012, Nature.

[51] Bin Zhang, et al. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse, 2011, Nucleic Acids Res.

[52] F. Jiang, et al. Small nucleolar RNA 42 acts as an oncogene in lung tumorigenesis, 2011, Oncogene.

[53] Brent S. Pedersen, et al. Pybedtools: a flexible Python library for manipulating genomic datasets and annotations, 2011, Bioinform.

[54] Scott E. Kern, et al. Oncogene-induced Nrf2 transcription promotes ROS detoxification and tumorigenesis, 2011, Nature.

[55] F. P. Roth, et al. Genome Analysis Reveals Interplay between 5′UTR Introns and Nuclear mRNA Export for Secretory and Mitochondrial Genes, 2011, PLoS Genetics.

[56] K. Jarrod Millman, et al. Python for Scientists and Engineers, 2011, Comput. Sci. Eng.

[57] Gaël Varoquaux, et al. The NumPy Array: A Structure for Efficient Numerical Computation, 2011, Computing in Science & Engineering.

[58] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[59] B. Berger, et al. Conserved microRNA targeting in Drosophila is as widespread in coding regions as in 3′UTRs, 2010, Proceedings of the National Academy of Sciences.

[60] Ozlem Keskin, et al. Human Cancer Protein-Protein Interaction Network: A Structural Perspective, 2009, PLoS Comput. Biol.

[61] Bartek Wilczynski, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics, 2009, Bioinform.

[62] D. Busam, et al. An Integrated Genomic Analysis of Human Glioblastoma Multiforme, 2008, Science.

[63] Li Zhao, et al. Helical domain and kinase domain mutations in p110α of phosphatidylinositol 3-kinase induce gain of function by different mechanisms, 2008, Proceedings of the National Academy of Sciences.

[64] Bert Vogelstein, et al. The Structure of a Human p110α/p85α Complex Elucidates the Effects of Oncogenic PI3Kα Mutations, 2007, Science.

[65] Yuval Inbar, et al. Mechanism of Two Classes of Cancer Mutations in the Phosphoinositide 3-Kinase Catalytic Subunit, 2007, Science.

[66] Travis E. Oliphant, et al. Python for Scientific Computing, 2007, Computing in Science & Engineering.

[67] Tea Lanišnik Rižner, et al. AKR1C1 and AKR1C3 may determine progesterone and estrogen ratios in endometrial cancer, 2006, Molecular and Cellular Endocrinology.

[68] T. Golub, et al. Increased expression of genes converting adrenal androgens to testosterone in androgen-independent prostate cancer, 2006, Cancer Research.

[69] G. Mills, et al. The RAB25 small GTPase determines aggressiveness of ovarian and breast cancers, 2004, Nature Medicine.

[70] F. Stanczyk, et al. Selective Loss of AKR1C1 and AKR1C2 in Breast Cancer and Their Potential Effect on Progesterone Signaling, 2004, Cancer Research.

[71] J. Ptak, et al. High Frequency of Mutations of the PIK3CA Gene in Human Cancers, 2004, Science.

[72] T. Hubbard, et al. A census of human cancer genes, 2004, Nature Reviews Cancer.

[73] A. Nicholson, et al. Mutations of the BRAF gene in human cancer, 2002, Nature.

[74] D. Jäger, et al. Identification of a tissue-specific putative transcription factor in breast tissue by serological screening of a breast cancer library, 2001, Cancer Research.

[75] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[76] M. Stratton, et al. A census of amplified and overexpressed human cancer genes, 2010, Nature Reviews Cancer.

[77] D. Schadendorf, et al. Highly Recurrent TERT Promoter Mutations in Human Melanoma, 2013, Science.

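[78] Pablo Cingolani, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3, 2012, Fly.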