Joint clustering of protein interaction networks through Markov random walk

Biological networks obtained by high-throughput profiling or human curation are typically noisy. For functional module identification, single network clustering algorithms may not yield accurate and robust results. In order to borrow information across multiple sources to alleviate such problems due to data quality, we propose a new joint network clustering algorithm ASModel in this paper. We construct an integrated network to combine network topological information based on protein-protein interaction (PPI) datasets and homological information introduced by constituent similarity between proteins across networks. A novel random walk strategy on the integrated network is developed for joint network clustering and an optimization problem is formulated by searching for low conductance sets defined on the derived transition matrix of the random walk, which fuses both topology and homology information. The optimization problem of joint clustering is solved by a derived spectral clustering algorithm. Network clustering using several state-of-the-art algorithms has been implemented to both PPI networks within the same species (two yeast PPI networks and two human PPI networks) and those from different species (a yeast PPI network and a human PPI network). Experimental results demonstrate that ASModel outperforms the existing single network clustering algorithms as well as another recent joint clustering algorithm in terms of complex prediction and Gene Ontology (GO) enrichment analysis.

[1]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[2]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  angesichts der Corona-Pandemie,et al.  UPDATE , 1973, The Lancet.

[4]  S. Dongen A cluster algorithm for graphs , 2000 .

[5]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[6]  Ioannis Xenarios,et al.  DIP: The Database of Interacting Proteins: 2001 update , 2001, Nucleic Acids Res..

[7]  K. Sneppen,et al.  Specificity and Stability in Topology of Protein Networks , 2002, Science.

[8]  L. Mirny,et al.  Protein complexes and functional modules in molecular networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[9]  Adam J. Smith,et al.  The Database of Interacting Proteins: 2004 update , 2004, Nucleic Acids Res..

[10]  Anton J. Enright,et al.  Detection of functional modules from protein interaction networks , 2003, Proteins.

[11]  J. F. Poyatos,et al.  How biologically relevant are interaction-based modules in protein networks? , 2004, Genome Biology.

[12]  Purvesh Khatri,et al.  Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments , 2004, Nucleic Acids Res..

[13]  Diego di Bernardo,et al.  Robust Identification of Large Genetic Networks , 2003, Pacific Symposium on Biocomputing.

[14]  Michael Y. Galperin,et al.  Identification and functional analysis of ‘hypothetical’ genes expressed in Haemophilus influenzae , 2004 .

[15]  A. Emili,et al.  Interaction network containing conserved and essential protein complexes in Escherichia coli , 2005, Nature.

[16]  T. Ideker,et al.  Systematic interpretation of genetic interactions using protein networks , 2005, Nature Biotechnology.

[17]  Tobias Müller,et al.  Identifying functional modules in protein–protein interaction networks: an integrated exact approach , 2008, ISMB.

[18]  Octave Noubibou Doudieu,et al.  CORUM: the comprehensive resource of mammalian protein complexes , 2007, Nucleic Acids Res..

[19]  Kara Dolinski,et al.  The BioGRID Interaction Database: 2008 update , 2008, Nucleic Acids Res..

[20]  Kara Dolinski,et al.  Gene Ontology annotations at SGD: new data sources and annotation methods , 2007, Nucleic Acids Res..

[21]  Geoffrey J. Barton,et al.  PIPs: human protein–protein interaction prediction database , 2008, Nucleic Acids Res..

[22]  Andrea Lancichinetti,et al.  Detecting the overlapping and hierarchical community structure in complex networks , 2008, 0802.1218.

[23]  Srinivasan Parthasarathy,et al.  Scalable graph clustering using stochastic flows: applications to community discovery , 2009, KDD.

[24]  Sandhya Rani,et al.  Human Protein Reference Database—2009 update , 2008, Nucleic Acids Res..

[25]  Karthik Raman,et al.  Construction and analysis of protein–protein interaction networks , 2010, Automated experimentation.

[26]  Hans-Werner Mewes,et al.  CORUM: the comprehensive resource of mammalian protein complexes , 2007, Nucleic Acids Res..

[27]  Srinivasan Parthasarathy,et al.  Markov clustering of protein interaction networks with improved balance and scalability , 2010, BCB '10.

[28]  Srinivasan Parthasarathy,et al.  Symmetrizations for clustering directed graphs , 2011, EDBT/ICDT '11.

[29]  E. Fikrig,et al.  Molecular Interactions that Enable Movement of the Lyme Disease Agent from the Tick Gut into the Hemolymph , 2011, PLoS pathogens.

[30]  Shinya Honda,et al.  Convergent evolution in structural elements of proteins investigated using cross profile analysis , 2012, BMC Bioinformatics.

[31]  M. Cannataro,et al.  AlignNemo: A Local Network Alignment Method to Integrate Homology and Topology , 2012, PloS one.

[32]  Srinivasan Parthasarathy,et al.  Identifying functional modules in interaction networks through overlapping Markov clustering , 2012, Bioinform..

[33]  Srinivasan Parthasarathy,et al.  Scalable global alignment for multiple biological networks , 2012, BMC Bioinformatics.

[34]  Qingyang Wang,et al.  Disruption of TAB1/p38α interaction using a cell-permeable peptide limits myocardial ischemia/reperfusion injury. , 2013, Molecular therapy : the journal of the American Society of Gene Therapy.

[35]  Yijie Wang,et al.  Functional module identification in protein interaction networks by interaction patterns , 2014, Bioinform..

[36]  R. Tsien,et al.  Specificity and Stability in Topology of Protein Networks , 2022 .