Decomposing PPI networks for complex discovery

BackgroundProtein complexes are important for understanding principles of cellular organization and functions. With the availability of large amounts of high-throughput protein-protein interactions (PPI), many algorithms have been proposed to discover protein complexes from PPI networks. However, existing algorithms generally do not take into consideration the fact that not all the interactions in a PPI network take place at the same time. As a result, predicted complexes often contain many spuriously included proteins, precluding them from matching true complexes.ResultsWe propose two methods to tackle this problem: (1) The localization GO term decomposition method: We utilize cellular component Gene Ontology (GO) terms to decompose PPI networks into several smaller networks such that the proteins in each decomposed network are annotated with the same cellular component GO term. (2) The hub removal method: This method is based on the observation that hub proteins are more likely to fuse clusters that correspond to different complexes. To avoid this, we remove hub proteins from PPI networks, and then apply a complex discovery algorithm on the remaining PPI network. The removed hub proteins are added back to the generated clusters afterwards. We tested the two methods on the yeast PPI network downloaded from BioGRID. Our results show that these methods can improve the performance of several complex discovery algorithms significantly. Further improvement in performance is achieved when we apply them in tandem.ConclusionsThe performance of complex discovery algorithms is hindered by the fact that not all the interactions in a PPI network take place at the same time. We tackle this problem by using localization GO terms or hubs to decompose a PPI network before complex discovery, which achieves considerable improvement.

[1]  Guimei Liu,et al.  Decomposing PPI networks for complex discovery , 2010, BIBM.

[2]  Igor Jurisica,et al.  Functional topology in a network of protein interactions , 2004, Bioinform..

[3]  S. Pu,et al.  Up-to-date catalogues of yeast protein complexes , 2008, Nucleic acids research.

[4]  Gang Chen,et al.  Modifying the DPClus algorithm for identifying protein complexes based on new topological structures , 2008, BMC Bioinformatics.

[5]  Limsoon Wong,et al.  Protein complex prediction based on k-connected subgraphs in protein interaction network , 2010, BMC Systems Biology.

[6]  P. Bork,et al.  Structure-Based Assembly of Protein Complexes in Yeast , 2004, Science.

[7]  Gary D. Bader,et al.  An automated method for finding molecular complexes in large protein interaction networks , 2003, BMC Bioinformatics.

[8]  Dmitrij Frishman,et al.  MIPS: analysis and annotation of proteins from whole genomes in 2005 , 2005, Nucleic Acids Res..

[9]  Limsoon Wong,et al.  Using Indirect protein-protein Interactions for protein Complex Prediction , 2008, J. Bioinform. Comput. Biol..

[10]  Limsoon Wong,et al.  Using indirect protein-protein interactions for protein complex predication. , 2007, Computational systems bioinformatics. Computational Systems Bioinformatics Conference.

[11]  Mike Tyers,et al.  BioGRID: a general repository for interaction datasets , 2005, Nucleic Acids Res..

[12]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[13]  Lan V. Zhang,et al.  Evidence for dynamically organized modularity in the yeast protein–protein interaction network , 2004, Nature.

[14]  S. Dongen Graph clustering by flow simulation , 2000 .

[15]  Illés J. Farkas,et al.  CFinder: locating cliques and overlapping modules in biological networks , 2006, Bioinform..

[16]  Igor Jurisica,et al.  Protein complex prediction via cost-based clustering , 2004, Bioinform..

[17]  Shigehiko Kanaya,et al.  Development and implementation of an algorithm for detection of protein complexes in large interaction networks , 2006, BMC Bioinformatics.

[18]  Guimei Liu,et al.  Complex discovery from weighted PPI networks , 2009, Bioinform..