Protein Complex Prediction in Large Ontology Attributed Protein-Protein Interaction Networks

Protein complexes are key to understanding cellular organization and function. Many computational approaches have been developed to predict protein complexes from protein-protein interaction (PPI) networks. However, most existing approaches rely mainly on the topological structure of PPI networks and largely ignore gene ontology (GO) annotation information. In this paper, we construct ontology attributed PPI networks from PPI data and GO resources, and propose a novel approach on these networks called CSO (clustering based on network structure and ontology attribute similarity). Because structural information and GO attribute information are complementary in ontology attributed networks, CSO exploits the correlation between frequent GO annotation sets and dense subgraphs to predict protein complexes. We applied CSO to four different yeast PPI data sets, where it predicted many well-known protein complexes. The experimental results show that CSO is effective at predicting protein complexes and achieves state-of-the-art performance.
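To make the notion of an ontology attributed network concrete, the following is a minimal sketch and not the CSO algorithm itself. It assumes a toy network in which each protein carries a set of GO terms, selects the proteins that share a given GO annotation set, and measures the density of the subgraph they induce. The protein names, GO identifiers, and function names are all hypothetical placeholders chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: protein names and GO term IDs are placeholders,
# not taken from the paper or from any real data set.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
go_annotations = {
    "P1": {"GO:A", "GO:B"},
    "P2": {"GO:A", "GO:B"},
    "P3": {"GO:A", "GO:B", "GO:C"},
    "P4": {"GO:C"},
}


def build_attributed_network(edges, annotations):
    """Return an undirected adjacency map covering every annotated protein."""
    adjacency = {protein: set() for protein in annotations}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def proteins_sharing(annotations, go_set):
    """Proteins whose annotation set contains every term in go_set."""
    return {p for p, terms in annotations.items() if go_set <= terms}


def density(adjacency, nodes):
    """Fraction of possible node pairs in `nodes` that are connected."""
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1 for u, v in combinations(sorted(nodes), 2) if v in adjacency[u]
    )
    return 2.0 * present / (n * (n - 1))


adjacency = build_attributed_network(edges, go_annotations)
shared = proteins_sharing(go_annotations, {"GO:A", "GO:B"})
print(sorted(shared), density(adjacency, shared))
```

In this toy example the three proteins annotated with both placeholder terms form a fully connected triangle, which is the kind of agreement between a shared annotation set and a dense subgraph that the abstract describes. How CSO identifies frequent annotation sets and scores candidate complexes is not specified here, so the sketch makes no claim about those steps.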
