Overlapping functional modules detection in PPI network with pair‐wise constrained non‐negative matrix tri‐factorisation

A large amount of protein-protein interaction (PPI) data has been generated by high-throughput experimental techniques. Uncovering functional modules from PPI networks helps us better understand the underlying mechanisms of cellular functions. Numerous computational algorithms have been designed over the past decades to identify functional modules automatically. However, most community detection methods, whether non-overlapping or overlapping, are unsupervised models that cannot incorporate well-known protein complexes as prior knowledge. The authors propose a novel semi-supervised model named pairwise constrained non-negative matrix tri-factorisation (PCNMTF), which uses well-known protein complexes as prior information to find overlapping functional modules in PPI networks by learning a protein module indicator matrix and a module correlation matrix simultaneously. Based on non-negative matrix tri-factorisation, PCNMTF explicitly models and learns the mixed module memberships of each protein while accounting for the correlation among modules. Experimental results on both synthetic and real-world biological networks demonstrate that PCNMTF identifies more precise functional modules than state-of-the-art methods.
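
To make the idea concrete, the following is a minimal sketch of a symmetric non-negative matrix tri-factorisation A ≈ H S Hᵀ with pairwise must-link constraints, where H plays the role of the protein module indicator matrix and S the module correlation matrix. It is not the authors' algorithm: the abstract does not give the objective or the update rules, so the graph-Laplacian penalty on must-link pairs, the damped multiplicative updates, and all names (`pcnmtf_sketch`, `must_link`, `lam`, `overlapping_modules`, `threshold`) are assumptions made purely for illustration.

```python
import numpy as np


def pcnmtf_sketch(A, must_link, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Illustrative symmetric NMTF, A ~= H S H^T, with a must-link penalty.

    Assumed objective (not taken from the paper):
        ||A - H S H^T||_F^2 + lam * tr(H^T (D - M) H)
    where M is the must-link adjacency matrix and D its degree matrix.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))          # protein-by-module membership strengths
    S = rng.random((k, k))
    S = (S + S.T) / 2               # keep module correlations symmetric

    # Must-link matrix built from pairs known to share a complex.
    M = np.zeros((n, n))
    for i, j in must_link:
        M[i, j] = M[j, i] = 1.0
    D = np.diag(M.sum(axis=1))

    for _ in range(n_iter):
        HS = H @ S
        # Damped multiplicative update for H (heuristic; keeps H >= 0).
        num = 2 * (A @ HS) + lam * (M @ H)
        den = 2 * (HS @ (H.T @ HS)) + lam * (D @ H) + eps
        H *= (num / den) ** 0.25
        # Multiplicative update for S (keeps S >= 0 and symmetric).
        HtH = H.T @ H
        S *= (H.T @ A @ H) / (HtH @ S @ HtH + eps)
    return H, S


def overlapping_modules(H, threshold=0.3, eps=1e-9):
    """Assign each protein to every module whose normalised weight passes threshold."""
    W = H / (H.sum(axis=1, keepdims=True) + eps)
    return W >= threshold


# Toy network: two 4-node cliques that share node 3.
n = 7
A = np.zeros((n, n))
for group in ([0, 1, 2, 3], [3, 4, 5, 6]):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0

# Hypothetical prior knowledge: proteins 0 and 1 belong to one known complex.
H, S = pcnmtf_sketch(A, must_link=[(0, 1)], k=2)
membership = overlapping_modules(H)
```

In this sketch a protein may pass the threshold for more than one module, which is how overlap is expressed; the threshold value and the penalty weight `lam` are free parameters here and would need tuning on real PPI data.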
