Protein complex detection algorithm based on multiple topological characteristics in PPI networks

Abstract: Detecting protein complexes from available protein–protein interaction (PPI) networks is an important task, and several algorithms have been proposed for it. These algorithms usually consider a single topological metric and ignore the rich topological characteristics and inherent organization of protein complexes, although the effective use of such information is crucial to protein complex detection. To overcome this deficiency, this study presents a heuristic clustering algorithm that identifies protein complexes by exploiting the topological information of PPI networks more fully. A new nodal metric, which combines the clustering coefficient and the node degree, is proposed to quantify the importance of each node within a local subgraph. An iterative paradigm is used to incrementally identify seed proteins and expand each seed into a cluster. First, among the unclustered nodes, the node with the highest nodal metric is selected as a new seed. Then, the seed is expanded into a cluster by recursively adding candidate nodes from its neighbors according to both the density of the cluster and the connection between a candidate node and the cluster. The experimental results demonstrate that the proposed algorithm outperforms competing algorithms in terms of F-measure and accuracy.
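The seed-and-expand loop described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the exact nodal metric or the expansion thresholds, so the metric used here (clustering coefficient multiplied by degree), the parameters `min_density` and `min_attach`, the restriction to non-overlapping clusters, and all function names are assumptions made for the example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def nodal_metric(adj, v):
    """ASSUMED metric: clustering coefficient times degree (not from the paper)."""
    return clustering_coefficient(adj, v) * len(adj[v])


def density(adj, nodes):
    """Edge density of the subgraph induced by `nodes`."""
    n = len(nodes)
    if n < 2:
        return 1.0
    edges = sum(1 for a, b in combinations(nodes, 2) if b in adj[a])
    return 2.0 * edges / (n * (n - 1))


def detect_complexes(adj, min_density=0.7, min_attach=0.5):
    """Seed-and-expand clustering; both thresholds are hypothetical values."""
    unclustered = set(adj)
    clusters = []
    while unclustered:
        # Seed: the unclustered node with the highest nodal metric
        # (sorting first makes tie-breaking deterministic).
        seed = max(sorted(unclustered), key=lambda v: nodal_metric(adj, v))
        cluster = {seed}
        grown = True
        while grown:
            grown = False
            # Candidates: unclustered neighbors of the current cluster.
            neighbors = {u for v in cluster for u in adj[v]}
            candidates = (neighbors & unclustered) - cluster
            best, best_attach = None, 0.0
            for u in sorted(candidates):
                # Connection strength: share of cluster members linked to u.
                attach = len(adj[u] & cluster) / len(cluster)
                if (attach >= min_attach and attach > best_attach
                        and density(adj, cluster | {u}) >= min_density):
                    best, best_attach = u, attach
            if best is not None:
                cluster.add(best)
                grown = True
        clusters.append(cluster)
        unclustered -= cluster
    return clusters


# Toy network: two triangles joined by the single edge c-d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
toy_clusters = detect_complexes(build_adjacency(toy_edges))
print([sorted(c) for c in toy_clusters])
# prints [['a', 'b', 'c'], ['d', 'e', 'f']]
```

On the toy network, the bridge node `d` is rejected from the first cluster because it connects to only one of three members, which illustrates why the expansion step checks the candidate's connection to the cluster in addition to the cluster's density.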
