A Fast Agglomerative Community Detection Method for Protein Complex Discovery in Protein Interaction Networks

Proteins are known to interact with each other by forming protein complexes in order to perform specific biological functions. Many community detection methods have been devised for the discovery of protein complexes in protein interaction networks (PINs). One common problem in current agglomerative community detection approaches is that vertices with just one neighbor are often classified as separate clusters, which is not meaningful for complex identification. A further major limitation of agglomerative techniques is that their computational efficiency does not scale well to large PINs. In this paper, we propose a new agglomerative algorithm, FAC-PIN, which is based on a local premetric of relative vertex-to-vertex clustering value and addresses both of these issues. FAC-PIN is applied to eight PINs from different species, and the identified complexes are validated against experimentally verified complexes. Preliminary computational results show that FAC-PIN discovers protein complexes from PINs more accurately and faster than the HC-PIN and CNM algorithms, the current state-of-the-art agglomerative approaches to complex prediction.
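The single-neighbor problem described above can be illustrated with a minimal sketch. This is not the FAC-PIN algorithm: the abstract does not define its premetric, so the sketch assumes a stand-in similarity (Jaccard overlap of closed neighborhoods) and a hypothetical merge threshold of 0.5. The function names `similarity` and `cluster` and the flag `attach_pendants` are likewise illustrative. Vertices are merged bottom-up along edges whose similarity reaches the threshold, and degree-one vertices are optionally attached to the cluster of their only neighbor.

```python
def similarity(adj, u, v):
    """Stand-in local score (an assumption, not the paper's premetric):
    Jaccard overlap of the closed neighborhoods of u and v."""
    a = adj[u] | {u}
    b = adj[v] | {v}
    return len(a & b) / len(a | b)


def cluster(edges, threshold=0.5, attach_pendants=True):
    """Bottom-up merging over edges; returns sorted clusters."""
    # Build an undirected adjacency map.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Union-find keeps track of which vertices have been merged.
    parent = {v: v for v in adj}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Merge the endpoints of every sufficiently similar edge.
    for u, v in edges:
        if similarity(adj, u, v) >= threshold:
            union(u, v)

    # Optionally attach each degree-one vertex to its only neighbor,
    # so it does not remain a cluster of its own.
    if attach_pendants:
        for v, nbrs in adj.items():
            if len(nbrs) == 1:
                union(v, next(iter(nbrs)))

    groups = {}
    for v in adj:
        groups.setdefault(find(v), set()).add(v)
    return sorted(sorted(g) for g in groups.values())


# Toy network: two triangles joined by a bridge, plus a pendant vertex D.
toy_edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"),
    ("C", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
]
```

On this toy network, calling `cluster(toy_edges, attach_pendants=False)` leaves the pendant vertex D as a singleton cluster, while the default call places it in the same cluster as its neighbor C. How FAC-PIN itself treats such vertices is specified in the paper, not here.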
