Discovering protein complexes in protein interaction networks via exploring the weak ties effect

Background: Studying protein complexes is important because it helps reveal structure-function relationships in biological networks, and much attention has been paid to accurately predicting protein complexes from the growing amount of protein-protein interaction (PPI) data. Most available algorithms assume that dense subgraphs correspond to complexes, and so fail to take into account the inherent organization within a protein complex and the roles of edges. There is therefore a need to investigate whether protein complexes can be discovered using the topological information hidden in edges.

Results: To investigate the roles of edges in PPI networks, we show that edges connecting topologically less similar vertices are more significant in maintaining global connectivity, which indicates a weak ties phenomenon in PPI networks. We further demonstrate a negative relation between weak tie strength and topological similarity. Using the bridges, a reliable virtual network is constructed in which each maximal clique corresponds to the core of a complex. Under this notion, detecting protein complexes is transformed into the classic all-clique problem. A novel core-attachment based method is developed, which detects the cores and the attachments separately. The existing algorithms and our algorithm are compared comprehensively by matching the predicted complexes against benchmark complexes.

Conclusions: We showed that the weak tie effect exists in the PPI network and demonstrated that density alone is insufficient to characterize the topological structure of protein complexes. Furthermore, experimental results on the yeast PPI network show that the proposed method outperforms state-of-the-art algorithms. Analysis of the modules detected by the present algorithm suggests that most of them are biologically significant in the context of complexes, which in turn suggests that the roles of edges are critical in discovering protein complexes.
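
The two ingredients named in the abstract, edge-level topological similarity and maximal-clique enumeration, can be illustrated with a small sketch. This is not the authors' method: the abstract does not define the bridgeness measure or how the virtual network is built, so the sketch assumes Jaccard similarity of neighbor sets as a stand-in for topological similarity and uses the standard Bron-Kerbosch algorithm with pivoting to list maximal cliques directly on a toy graph. All function names and the toy edge list are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_similarity(adj, u, v):
    """Assumed similarity measure: Jaccard index of the neighbor sets
    of u and v, excluding the endpoints themselves."""
    nu = adj[u] - {v}
    nv = adj[v] - {u}
    union = nu | nv
    if not union:
        return 0.0
    return len(nu & nv) / len(union)


def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch (with pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate nodes, x: already-processed nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pick the pivot with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy network (hypothetical): two triangles joined by the single edge C-D.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]
adj = build_adjacency(edges)

# Edges sorted from least to most similar endpoints; under the assumption
# above, low-similarity edges are the candidates for weak ties.
by_similarity = sorted(edges, key=lambda e: jaccard_similarity(adj, *e))

# Candidate cores: maximal cliques with at least three members
# (the size threshold is an assumption made for this example).
cores = [c for c in maximal_cliques(adj) if len(c) >= 3]
```

In this toy graph the edge C-D joins two vertices with no common neighbors, so it ranks first in `by_similarity`, while the two triangles are the only cliques large enough to be kept in `cores`. A real pipeline would substitute the paper's own bridgeness definition and run the clique search on the constructed virtual network rather than on the raw PPI graph.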
