Interaction-based discovery of functionally important genes in cancers

A major challenge in cancer genomics is identifying the genes that play an active role in tumorigenesis among the potentially large pool of genes mutated across patient samples. Here we focus on the interactions that proteins make with nucleic acids, small molecules, ions and peptides, and show that residues involved in these interactions are more frequently affected by mutations observed in large-scale cancer genomic data than are other residues. We leverage this observation to predict genes that play a functionally important role in cancers by introducing a computational pipeline (http://canbind.princeton.edu) that maps large-scale cancer exome data across patients onto protein structures and automatically extracts proteins with an enriched number of mutations affecting their nucleic acid, small molecule, ion or peptide binding sites. Using this approach, we show that many genes previously implicated in cancers are enriched in mutations within the binding sites of their encoded proteins. Because it focuses on functionally relevant portions of proteins, specifically those known to be involved in molecular interactions, our approach is particularly well suited to detecting infrequent mutations that may nonetheless be important in cancer, and should aid in expanding our functional understanding of the genomic landscape of cancer.
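
The enrichment idea described above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the abstract does not state which statistical test is used, so the one-sided binomial test, the Benjamini-Hochberg correction across proteins, and all function names and inputs below are assumptions chosen for illustration. The null model assumes that each observed mutation falls on a binding-site residue with probability equal to the fraction of the protein's residues annotated as binding.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binding_site_enrichment(protein_length, binding_positions, mutated_positions):
    """Hypothetical helper: test whether mutations concentrate in binding sites.

    mutated_positions holds one entry per observed mutation, so a residue
    hit in several patients appears several times.
    Returns (mutations in binding sites, total mutations, one-sided p-value).
    """
    binding = set(binding_positions)
    n = len(mutated_positions)
    k = sum(1 for pos in mutated_positions if pos in binding)
    # Assumed null: mutations land uniformly at random along the protein.
    p = len(binding) / protein_length
    return k, n, binomial_tail(k, n, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy input (made up): a 100-residue protein whose first 10 residues bind a
# ligand, with four observed mutations, three of them in the binding site.
k, n, pval = binding_site_enrichment(100, range(1, 11), [3, 3, 5, 50])
```

In a real analysis one p-value would be computed per protein and the whole set passed to `benjamini_hochberg`. A uniform null is a simplification; a fuller model might account for sequence context or per-residue mutability.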
