Systematic integration of protein-affecting mutations, gene fusions, and copy number alterations into a comprehensive somatic mutational profile

Somatic mutations arise as random genetic changes in genes and take the form of protein-affecting mutations (PAMs), gene fusions, or copy number alterations (CNAs). Mutations of different types can have a similar phenotypic effect (i.e., allelic heterogeneity) and should therefore be integrated into a unified gene mutation profile. We developed OncoMerge to fill this niche: it integrates somatic mutations to capture allelic heterogeneity, assigns a function to mutations, and overcomes known obstacles in cancer genetics. Applying OncoMerge to the TCGA Pan-Cancer Atlas increased the number and frequency of somatically mutated genes and improved prediction of each somatic mutation's role as either activating or loss of function. Using the integrated somatic mutation matrices increased the power to infer gene regulatory networks and uncovered enrichment of switch-like feedback motifs and delay-inducing feed-forward loops. These studies demonstrate that OncoMerge efficiently integrates PAMs, fusions, and CNAs and strengthens downstream analyses linking somatic mutations to cancer phenotypes.

Motivation

Genes are mutated in tumors through PAMs, CNAs, or gene fusions, which splits the signal of the somatic mutation effect across these three mutation types. We developed OncoMerge to systematically integrate the three mutation types into a single mutation profile that better captures the impact of somatic mutations on cancer phenotypes. As a tool, OncoMerge fills the gap between sophisticated variant-calling pipelines and downstream analyses.
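To make the idea of a split signal concrete, the following is a minimal sketch of merging per-type mutation calls into one per-gene profile. It is not OncoMerge's actual procedure: the text above does not specify the integration rule, so the sketch assumes the simplest possible one, a per-gene union across mutation types. The function names (`merge_mutation_types`, `mutation_frequency`), the gene and patient labels, and the toy data are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical representation: gene -> set of patient IDs carrying an alteration.
MutationCalls = Dict[str, Set[str]]


def merge_mutation_types(*call_sets: MutationCalls) -> MutationCalls:
    """Union the per-gene patient sets across mutation types (assumed rule)."""
    merged: MutationCalls = {}
    for calls in call_sets:
        for gene, patients in calls.items():
            merged.setdefault(gene, set()).update(patients)
    return merged


def mutation_frequency(calls: MutationCalls, n_patients: int) -> Dict[str, float]:
    """Fraction of the cohort in which each gene is altered."""
    return {gene: len(patients) / n_patients for gene, patients in calls.items()}


# Toy cohort of five patients (P1..P5); all values are made up for illustration.
pams: MutationCalls = {"GENE_A": {"P1", "P2"}, "GENE_B": {"P3"}}
cnas: MutationCalls = {"GENE_A": {"P2", "P4"}}
fusions: MutationCalls = {"GENE_B": {"P5"}, "GENE_C": {"P1"}}

merged = merge_mutation_types(pams, cnas, fusions)
pam_only_freqs = mutation_frequency(pams, n_patients=5)
merged_freqs = mutation_frequency(merged, n_patients=5)

for gene in sorted(merged_freqs):
    print(gene, pam_only_freqs.get(gene, 0.0), merged_freqs[gene])
```

In this toy cohort, GENE_A is altered in two patients by PAMs alone but in three once CNAs are included, and GENE_C appears only after fusions are merged in. That is the sense in which integration can raise both the number and the frequency of mutated genes.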
