Development and implementation of an algorithm for detection of protein complexes in large interaction networks

BackgroundAfter complete sequencing of a number of genomes the focus has now turned to proteomics. Advanced proteomics technologies such as two-hybrid assay, mass spectrometry etc. are producing huge data sets of protein-protein interactions which can be portrayed as networks, and one of the burning issues is to find protein complexes in such networks. The enormous size of protein-protein interaction (PPI) networks warrants development of efficient computational methods for extraction of significant complexes.ResultsThis paper presents an algorithm for detection of protein complexes in large interaction networks. In a PPI network, a node represents a protein and an edge represents an interaction. The input to the algorithm is the associated matrix of an interaction network and the outputs are protein complexes. The complexes are determined by way of finding clusters, i. e. the densely connected regions in the network. We also show and analyze some protein complexes generated by the proposed algorithm from typical PPI networks of Escherichia coli and Saccharomyces cerevisiae. A comparison between a PPI and a random network is also performed in the context of the proposed algorithm.ConclusionThe proposed algorithm makes it possible to detect clusters of proteins in PPI networks which mostly represent molecular biological functional units. Therefore, protein complexes determined solely based on interaction data can help us to predict the functions of proteins, and they are also useful to understand and explain certain biological processes.

[1]  Stephen B. Seidman,et al.  Network structure and minimum degree , 1983 .

[2]  P. Bork,et al.  Functional organization of the yeast proteome by systematic analysis of protein complexes , 2002, Nature.

[3]  P. Erdos,et al.  On the evolution of random graphs , 1984 .

[4]  A. Barabasi,et al.  Lethality and centrality in protein networks , 2001, Nature.

[5]  B. Bollobás The evolution of random graphs , 1984 .

[6]  A. Rbnyi ON THE EVOLUTION OF RANDOM GRAPHS , 2001 .

[7]  T. Takagi,et al.  Assessment of prediction accuracy of protein function from protein–protein interaction data , 2001, Yeast.

[8]  Arunabha Sen,et al.  Graph Clustering Using Distance-k Cliques , 1999, GD.

[9]  Guanrong Chen,et al.  Complex networks: small-world, scale-free and beyond , 2003 .

[10]  D. Matula k-Components, Clusters and Slicings in Graphs , 1972 .

[11]  Gary D. Bader,et al.  An automated method for finding molecular complexes in large protein interaction networks , 2003, BMC Bioinformatics.

[12]  B. Schwikowski,et al.  A network of protein–protein interactions in yeast , 2000, Nature Biotechnology.

[13]  Ignacio Marín,et al.  Iterative Cluster Analysis of Protein Interaction Data , 2005, Bioinform..

[14]  David Botstein,et al.  GO: : TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes , 2004, Bioinform..

[15]  L. Mirny,et al.  Protein complexes and functional modules in molecular networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[16]  Gary D Bader,et al.  Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry , 2002, Nature.

[17]  Igor Jurisica,et al.  Protein complex prediction via cost-based clustering , 2004, Bioinform..