Joint clustering of protein interaction networks by block modeling

Identifying functional modules in protein-protein interaction (PPI) networks can help us better understand cell functions. Many existing computational methods identify modules from either individual PPI networks or protein sequence similarities within a single species. Because both interaction data and sequence similarities may be incomplete or inaccurate with respect to revealing protein functionality, we propose a joint clustering framework based on block modeling that integrates protein interaction data and sequence similarities across different species. The motivation is to borrow strength from multiple data sources for more accurate module identification, as evolutionarily different species may share similar cellular organization. Our blockmodel joint clustering identifies not only densely connected modules but also modules whose proteins have similar interaction patterns with the rest of the network. To solve the resulting non-convex combinatorial optimization problem, we develop a simulated annealing (SA) algorithm based on Potts models. We validate our method on synthetic networks as well as yeast and fruit fly PPI networks. The experimental results show that joint clustering outperforms clustering each network separately.
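To make the idea of annealing over a joint, cross-species objective concrete, the following is a minimal sketch and not the method of this paper. It assumes a toy objective in place of the blockmodel objective: Newman modularity on each network (a Potts-type quality function) plus a reward for sequence-similar protein pairs that land in the same module. The function names, the `weight` parameter, the cooling schedule, and the toy yeast and fly networks are all hypothetical and chosen only for illustration.

```python
import math
import random


def modularity(edges, labels):
    """Newman modularity of an undirected graph under a node -> module map."""
    m = len(edges)
    inside = {}  # module -> number of edges with both endpoints in the module
    degree = {}  # module -> total degree of the nodes in the module
    for u, v in edges:
        degree[labels[u]] = degree.get(labels[u], 0) + 1
        degree[labels[v]] = degree.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            inside[labels[u]] = inside.get(labels[u], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def joint_score(net_a, net_b, similar_pairs, labels, weight=0.5):
    """Toy joint objective (an assumption, not the paper's blockmodel):
    modularity of each network plus a bonus for similar pairs that agree."""
    agree = sum(labels[a] == labels[b] for a, b in similar_pairs)
    bonus = weight * agree / len(similar_pairs) if similar_pairs else 0.0
    return modularity(net_a, labels) + modularity(net_b, labels) + bonus


def anneal(net_a, net_b, similar_pairs, k=2, steps=4000,
           t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over module labels shared by both networks."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in net_a + net_b for n in edge})
    labels = {n: rng.randrange(k) for n in nodes}
    score = joint_score(net_a, net_b, similar_pairs, labels)
    best, best_score = dict(labels), score
    t = t0
    for _ in range(steps):
        node = rng.choice(nodes)
        old = labels[node]
        labels[node] = rng.randrange(k)  # propose a new module for one node
        new_score = joint_score(net_a, net_b, similar_pairs, labels)
        delta = new_score - score
        # Metropolis rule: always accept improvements, sometimes accept losses
        if delta >= 0 or rng.random() < math.exp(delta / t):
            score = new_score
            if score > best_score:
                best, best_score = dict(labels), score
        else:
            labels[node] = old  # reject the move
        t = max(t * cooling, 1e-6)  # geometric cooling with a small floor
    return best, best_score


# Hypothetical toy data: two bridged triangles per species.
yeast = [("y1", "y2"), ("y2", "y3"), ("y1", "y3"),
         ("y4", "y5"), ("y5", "y6"), ("y4", "y6"), ("y3", "y4")]
fly = [("f1", "f2"), ("f2", "f3"), ("f1", "f3"),
       ("f4", "f5"), ("f5", "f6"), ("f4", "f6"), ("f3", "f4")]
pairs = [("y1", "f1"), ("y5", "f5")]  # assumed sequence-similar pairs

modules, score = anneal(yeast, fly, pairs)
print(modules, round(score, 3))
```

Because the module labels are shared across both networks, the similarity bonus is what couples the two clusterings; with `weight=0` the sketch reduces to clustering each network independently, which is the baseline the abstract compares against.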
