An effective approach to detecting both small and large complexes from protein-protein interaction networks

Background: Predicting protein complexes from protein-protein interaction (PPI) networks has been studied for a decade. Various methods have been proposed to address challenging aspects of this problem, including overlapping clusters, the high false positive and false negative rates of PPI data, and diverse complex structures. It is well known that most current methods can effectively detect only complexes of size ≥ 3, which account for only about half of all known complexes. Recently, a method was proposed specifically for finding small complexes (size 2 and 3) from PPI networks. However, until now there has been no effective approach that can predict both small (size ≤ 3) and large (size > 3) complexes from PPI networks.

Results: In this paper, we propose a novel method, called CPredictor2.0, that can detect both small and large complexes under a unified framework. Concretely, we first group proteins with similar functions. Then, the Markov clustering algorithm is employed to discover clusters in each group. Finally, we merge all discovered clusters that overlap with each other to a certain degree; the merged clusters together with the remaining clusters constitute the set of detected complexes. Extensive experiments show that the new method predicts both small and large complexes more effectively than state-of-the-art methods.

Conclusions: The proposed method, CPredictor2.0, can be applied to accurately predict both small and large protein complexes.
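The three steps described in the Results (group by function, cluster each group with Markov clustering, merge overlapping clusters) can be illustrated with a minimal sketch. This is not the CPredictor2.0 implementation. Several details are assumptions made only for illustration: the `functions` mapping from protein to a set of function labels, the rule that proteins sharing a label form a group, the overlap score |A∩B|² / (|A|·|B|), the `overlap_threshold` and `inflation` values, and the tiny dense `mcl` routine, which stands in for a real Markov clustering tool.

```python
from collections import defaultdict
from itertools import combinations


def mcl(nodes, edges, inflation=2.0, iters=50):
    """Tiny dense Markov clustering on an undirected graph (illustrative only)."""
    idx = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    # Adjacency matrix with self-loops, stored as m[row][col].
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        m[idx[a]][idx[b]] = m[idx[b]][idx[a]] = 1.0

    def normalize(mat):
        # Rescale so that every column sums to 1 (column-stochastic).
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] for j in range(n)] for i in range(n)]

    m = normalize(m)
    for _ in range(iters):
        # Expansion: square the matrix (two-step random walks).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power, then renormalize the columns.
        m = normalize([[x ** inflation for x in row] for row in m])
    # Each row that keeps mass is an attractor; its nonzero columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(nodes[j] for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters


def predict_complexes(edges, functions, overlap_threshold=0.5):
    """Group by function label, cluster each group, then merge overlapping clusters."""
    # Step 1 (assumed grouping rule): proteins sharing a label form a group.
    groups = defaultdict(set)
    for protein, labels in functions.items():
        for label in labels:
            groups[label].add(protein)

    # Step 2: run Markov clustering inside each group; keep clusters of size >= 2.
    clusters = set()
    for members in groups.values():
        inner = [(a, b) for a, b in edges if a in members and b in members]
        clusters |= {c for c in mcl(sorted(members), inner) if len(c) >= 2}

    # Step 3 (assumed overlap score): merge pairs until no pair exceeds the threshold.
    merged = [set(c) for c in clusters]
    changed = True
    while changed:
        changed = False
        for a, b in combinations(merged, 2):
            if len(a & b) ** 2 / (len(a) * len(b)) >= overlap_threshold:
                merged.remove(a)
                merged.remove(b)
                merged.append(a | b)
                changed = True
                break  # the list changed, so restart the pairwise scan
    return {frozenset(c) for c in merged}


# Toy network: a 4-clique A-B-C-D plus an isolated interacting pair E-F.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "D"), ("C", "D"), ("E", "F")]
toy_functions = {"A": {"f1", "f2"}, "B": {"f1", "f2"}, "C": {"f1", "f2"},
                 "D": {"f2"}, "E": {"f3"}, "F": {"f3"}}
toy_complexes = predict_complexes(toy_edges, toy_functions)
```

In this toy input, the `f1` group yields the cluster {A, B, C} and the `f2` group yields {A, B, C, D}. Their assumed overlap score is 9/12 = 0.75, so they merge into one large complex, while the size-2 cluster {E, F} from the `f3` group survives unchanged as a small complex.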
