Systematic characterization of pan-cancer mutation clusters

Cancer genome sequencing has shown that driver genes can often be distinguished not only by an elevated mutation frequency but also by specific nucleotide positions that accumulate changes at a high rate. However, the properties that determine a residue's potential to drive tumorigenesis when mutated have not yet been systematically investigated. Here, using a novel methodological approach, we identify and characterize a compendium of 180 hotspot residues within 160 human proteins that are mutated at a significant frequency and are likely to have a functionally relevant impact. We find that such mutations (i) are more prominent in proteins that can exist in on and off states, (ii) reflect the identity of the tumor of origin, and (iii) often localize within interfaces that mediate interactions with other proteins or ligands. We then examine structural data for human protein complexes and identify a number of additional protein interfaces that accumulate cancer mutations at a high rate. Together, these analyses suggest that disruption and dysregulation of protein interactions can be instrumental in switching the functions of cancer proteins and activating downstream changes.
