uKIN Combines New and Prior Information with Guided Network Propagation to Accurately Identify Disease Genes.

Protein interaction networks provide a powerful framework for identifying genes causal for complex genetic diseases. Here, we introduce a general framework, uKIN, that uses prior knowledge of disease-associated genes to guide, within known protein-protein interaction networks, random walks that are initiated from newly identified candidate genes. In large-scale testing across 24 cancer types, we demonstrate that our network propagation approach for integrating both prior and new information not only better identifies cancer driver genes than using either source of information alone but also readily outperforms other state-of-the-art network-based approaches. We also apply our approach to genome-wide association data to identify genes functionally relevant for several complex diseases. Overall, our work suggests that guided network propagation approaches that utilize both prior and new data are a powerful means to identify disease genes. uKIN is freely available for download at: https://github.com/Singh-Lab/uKIN.
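The guided propagation idea described above can be illustrated with a minimal sketch. The abstract does not give the algorithmic details, so everything below is an assumption made for illustration: prior knowledge is spread over the network with a regularized Laplacian kernel, and a random walk with restart, started from the newly identified candidates, is biased toward neighbors that score highly under that prior. The function name `guided_walk_scores` and the parameters `alpha` and `gamma` are hypothetical and are not taken from the uKIN software.

```python
import numpy as np

def guided_walk_scores(adj, prior, start, alpha=0.5, gamma=1.0, iters=200):
    """Illustrative guided random walk with restart (not the uKIN code).

    adj   : symmetric 0/1 adjacency matrix of a connected network (n x n)
    prior : non-negative vector marking known disease genes
    start : non-negative vector weighting newly identified candidates
    alpha : restart probability (assumed parameter)
    gamma : diffusion strength for the prior (assumed parameter)
    """
    adj = np.asarray(adj, dtype=float)
    prior = np.asarray(prior, dtype=float)
    start = np.asarray(start, dtype=float)
    n = adj.shape[0]

    # Step 1: spread prior knowledge over the network by solving
    # (I + gamma * L) q = prior, where L is the graph Laplacian.
    lap = np.diag(adj.sum(axis=1)) - adj
    q = np.linalg.solve(np.eye(n) + gamma * lap, prior)

    # Step 2: bias each step toward neighbors with a high prior score.
    # weights[i, j] = adj[i, j] * q[j], then normalize each row.
    weights = adj * q[np.newaxis, :]
    row_sums = weights.sum(axis=1, keepdims=True)
    trans = np.divide(weights, row_sums,
                      out=np.zeros_like(weights), where=row_sums > 0)

    # Step 3: random walk with restart from the new candidates,
    # iterated to (approximate) convergence.
    restart = start / start.sum()
    p = restart.copy()
    for _ in range(iters):
        p = alpha * restart + (1 - alpha) * (trans.T @ p)
    return p

# Toy network: node 0 is a new candidate with neighbors 1 and 2;
# node 3 is a known disease gene attached to node 1.
adj = [[0, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 0],
       [0, 1, 0, 0]]
scores = guided_walk_scores(adj, prior=[0, 0, 0, 1], start=[1, 0, 0, 0])
print(scores)
```

In this toy network the walk leaving node 0 favors node 1 over node 2, because node 1 lies next to the known disease gene. That is the sense in which prior knowledge guides the propagation of new evidence in this sketch.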
