Detecting overlapping protein complexes in weighted PPI network based on overlay network chain in quotient space

Protein complexes are the cornerstones of many biological processes: proteins assemble into various types of molecular machinery that perform a vast array of biological functions. A single protein may belong to multiple complexes, yet most existing protein complex detection algorithms cannot identify overlapping complexes. To address this problem, this paper proposes a new clustering algorithm based on an overlay network chain in quotient space, denoted ONCQS, for detecting overlapping protein complexes in weighted PPI networks. GO annotation data is used to weight the PPI network. In the quotient space, a multilevel overlay network is constructed from maximal complete subgraphs in order to mine overlapping protein complexes, and the overlay network chain is computed according to the compatibility relation. The protein complexes are contained in the last level of the overlay network. ONCQS was applied to four PPI databases (DIP, Gavin, Krogan and MIPS) and compared with five state-of-the-art methods (MCODE, MCL, CORE, ClusterONE and COACH); the results show that it outperforms these methods in detecting overlapping protein complexes.
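
To illustrate why maximal complete subgraphs lend themselves to overlapping complexes, the following is a minimal sketch and not the ONCQS algorithm itself. It assumes a toy weighted network whose edge weights stand in for GO-based similarity scores, drops edges below a hypothetical threshold, and enumerates maximal cliques with the standard Bron-Kerbosch procedure. The protein names, the weights, the threshold value and the function names are all invented for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple


def build_adjacency(
    edges: Dict[Tuple[str, str], float], threshold: float
) -> Dict[str, Set[str]]:
    """Keep only edges whose weight is at least `threshold` (assumed cutoff)."""
    adj: Dict[str, Set[str]] = defaultdict(set)
    for (u, v), weight in edges.items():
        if weight >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Enumerate maximal complete subgraphs (Bron-Kerbosch with pivoting)."""
    cliques: List[FrozenSet[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy weighted PPI network; weights are made-up stand-ins for GO similarity.
toy_edges = {
    ("A", "B"): 0.90, ("A", "C"): 0.80, ("B", "C"): 0.85,
    ("C", "D"): 0.70, ("C", "E"): 0.75, ("D", "E"): 0.90,
    ("E", "F"): 0.20,  # weak interaction, removed by the threshold
}

toy_adj = build_adjacency(toy_edges, threshold=0.5)
toy_cliques = maximal_cliques(toy_adj)

# Proteins that appear in more than one clique are the overlapping members.
membership: Dict[str, int] = defaultdict(int)
for clique in toy_cliques:
    for protein in clique:
        membership[protein] += 1
overlapping = {p for p, count in membership.items() if count > 1}

print(sorted(sorted(c) for c in toy_cliques))
print(sorted(overlapping))
```

In this toy network protein C belongs to both cliques, which is the kind of shared membership that a partition-based clustering cannot express. How ONCQS merges such subgraphs level by level through the compatibility relation is not specified in the abstract and is not modelled here.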
