Detecting Protein Complexes from Signed Protein-Protein Interaction Networks

Identifying protein complexes is fundamental to understanding the functional organization of the cell. With the accumulation of physical protein-protein interaction (PPI) data, the computational detection of protein complexes from available PPI networks has drawn considerable attention. While most existing protein complex detection algorithms focus on analyzing the physical PPI network, none of them takes into account the “signs” (i.e., activation-inhibition relationships) of physical interactions. Because the “signs” of interactions reflect the way proteins communicate, considering them can not only increase the accuracy of protein complex identification but also deepen our understanding of the mechanisms of cell function. In this study, we propose a novel Signed Graph regularized Nonnegative Matrix Factorization (SGNMF) model to identify protein complexes from signed PPI networks. In our experiments, we compare the results obtained by our model on signed PPI networks with those predicted by state-of-the-art complex detection techniques on the original unsigned PPI networks. We observe that considering the “signs” of interactions significantly benefits the detection of protein complexes. Furthermore, based on the predicted complexes, we predict a set of signed complex-complex interactions for each dataset, which provides novel insight into the higher-level organization of the cell. All experimental results and code can be downloaded from http://mail.sysu.edu.cn/home/stsddq@mail.sysu.edu.cn/dai/others/SGNMF.zip.
