Random Walk With Restart on Multiplex and Heterogeneous Biological Networks

Motivation: Recent years have witnessed an exponential growth in the number of identified interactions between biological molecules. These interactions are usually represented as large and complex networks, calling for the development of appropriate tools to exploit the functional information they contain. Random walk with restart (RWR) is the state-of-the-art guilt-by-association approach. It explores the network vicinity of gene/protein seeds to study their functions, based on the premise that nodes related to similar functions tend to lie close to each other in the networks.

Results: In this study, we extended the RWR algorithm to multiplex and heterogeneous networks. The walk can now explore different layers of physical and functional interactions between genes and proteins, such as protein-protein interactions and co-expression associations. In addition, the walk can also jump to a network containing different sets of edges and nodes, such as phenotype similarities between diseases. We devised a leave-one-out cross-validation strategy to evaluate the algorithm's ability to predict disease-associated genes. We demonstrate the improved performance of the multiplex-heterogeneous RWR compared to several random walks on monoplex or heterogeneous networks. Overall, our framework is able to leverage the different interaction sources to outperform current approaches. Finally, we applied the algorithm to predict candidate genes for the Wiedemann-Rautenstrauch syndrome, and to explore the network vicinity of the SHORT syndrome.

Availability and implementation: The source code is available on GitHub at: https://github.com/alberto-valdeolivas/RWR-MH. In addition, an R package is freely available through Bioconductor at: http://bioconductor.org/packages/RandomWalkRestartMH/.

Supplementary information: Supplementary data are available at Bioinformatics online.
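For readers unfamiliar with RWR, the basic iteration on a single (monoplex) network can be sketched as below. This is a minimal illustration in Python with NumPy, not the RWR-MH or RandomWalkRestartMH implementation, and it does not cover the multiplex or heterogeneous extensions. The function name, the restart probability of 0.7, the convergence tolerance, and the four-node toy network are all assumptions chosen for the example.

```python
import numpy as np


def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p_next = (1 - r) * M @ p + r * p0 until the L1 change drops below tol.

    adjacency : square, symmetric matrix of non-negative edge weights
    seeds     : indices of the seed nodes (hypothetical input format)
    restart   : probability r of jumping back to the seeds (assumed value)
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one edge")
    M = A / col_sums  # column-normalised transition matrix

    # Initial distribution: uniform over the seed nodes.
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * M @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy path network 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_network = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(toy_network, seeds=[0])
ranking = np.argsort(-scores)  # nodes ordered by decreasing proximity to the seed
print(scores, ranking)
```

On this toy network the scores sum to one and decrease with distance from the seed, which is the guilt-by-association behaviour the abstract describes: nodes close to the seeds are ranked as the most likely to share their function.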
