Identifying Protein Complexes Based on Multiple Topological Structures in PPI Networks

Many computational algorithms have been developed to identify protein complexes in protein-protein interaction (PPI) networks, but each relies on a single topological structure, such as cliques, dense subgraphs, core-attachment structures, or starlike structures. Protein complexes, however, exhibit intricate connection patterns in a PPI network, and no single topological structure can detect all of them. In this paper, we propose an algorithm that identifies protein complexes from PPI networks based on multiple topological structures. The proposed algorithm first applies four existing algorithms, each based on a single topological structure, to produce raw predictions with the corresponding structure. These raw predictions are then trimmed according to their topological information or GO annotations, and similar results are merged to generate the final predictions. Numerical experiments are conducted on a yeast PPI network from DIP and a human PPI network from HPRD. The results show that the algorithm based on multiple topological structures not only yields more predictions, but also achieves high accuracy in terms of f-score, matching with known protein complexes, and functional enrichment of GO terms.
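The merging step described above can be illustrated with a minimal sketch. The abstract does not specify how similarity is measured or how merging is performed, so everything here is an assumption made for illustration: the overlap score |A ∩ B|² / (|A| · |B|), the threshold value, and the function names `overlap_score` and `merge_similar` are hypothetical. The trimming step, which depends on topological information or GO annotations, is omitted.

```python
from typing import Iterable, List, Set


def overlap_score(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure: |A & B|^2 / (|A| * |B|)."""
    if not a or not b:
        return 0.0
    common = len(a & b)
    return common * common / (len(a) * len(b))


def merge_similar(predictions: Iterable[Set[str]],
                  threshold: float = 0.8) -> List[Set[str]]:
    """Greedily merge predicted complexes whose overlap score reaches
    the (assumed) threshold; merged complexes become the union."""
    merged: List[Set[str]] = []
    for pred in predictions:
        current = set(pred)
        changed = True
        while changed:
            changed = False
            for existing in merged:
                if overlap_score(current, existing) >= threshold:
                    # Absorb the similar complex and rescan, since the
                    # enlarged set may now match other complexes.
                    current |= existing
                    merged.remove(existing)
                    changed = True
                    break
        merged.append(current)
    return merged


if __name__ == "__main__":
    # Hypothetical raw predictions pooled from four structure-specific
    # methods (clique, dense subgraph, core-attachment, starlike).
    raw = [
        {"A", "B", "C"},
        {"A", "B", "C", "D"},
        {"X", "Y"},
        {"X", "Y"},
    ]
    for complex_ in merge_similar(raw, threshold=0.7):
        print(sorted(complex_))
```

Taking the union on merge is only one possible design choice; keeping the larger or denser of two similar predictions would be an equally plausible reading of the abstract.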
