A Novel Protein Complex Identification Algorithm Based on the Integration of Local Network Topology and Gene Ontology

Identifying protein complexes is an essential step toward understanding the principles of cellular organization and biochemical phenomena. High-throughput experimental techniques have produced large datasets of protein-protein interactions (PPIs). However, these datasets usually contain spurious interactions, which complicate the accurate computational identification of protein complexes. In this study, a novel method is developed to predict protein complexes based on PPI network topology and gene ontology. The functional similarity between two interacting proteins is computed to estimate the reliability of their interaction, and this estimate is used to weight the network. A minimum cut-based method is then used to detect protein complexes in the weighted network. Experimental results show that our method performs better than several efficient existing clustering algorithms.
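The two steps described above (weighting interactions by functional similarity, then cutting the weighted network) can be illustrated with a minimal sketch. It is not the method of this study: the abstract does not specify the similarity measure or the cut procedure, so the sketch assumes a Jaccard similarity over sets of gene ontology terms, a Stoer-Wagner global minimum cut, and a weighted-density stopping rule. All function names, thresholds (`min_density`, `min_size`), and the toy proteins and GO labels are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Assumed stand-in for functional similarity: overlap of GO term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def weight_network(edges, go_terms):
    """Map each interaction {u, v} to a reliability weight in [0, 1]."""
    return {frozenset((u, v)): jaccard(go_terms.get(u, set()), go_terms.get(v, set()))
            for u, v in edges if u != v}

def stoer_wagner(nodes, weights):
    """Global minimum cut of the subgraph on `nodes`: (cut weight, one side)."""
    groups = {n: {n} for n in nodes}   # original proteins merged into each super-node
    adj = {n: {} for n in nodes}
    for edge, w in weights.items():
        u, v = tuple(edge)
        if u in adj and v in adj:
            adj[u][v] = adj[v][u] = w
    best = (float("inf"), set())
    while len(adj) > 1:
        # Minimum cut phase: repeatedly add the most tightly connected node.
        start = next(iter(adj))
        order = [start]
        conn = {n: adj[start].get(n, 0.0) for n in adj if n != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            order.append(nxt)
            for n, w in adj[nxt].items():
                if n in conn:
                    conn[n] += w
        s, t = order[-2], order[-1]
        if cut_of_phase < best[0]:
            best = (cut_of_phase, set(groups[t]))
        # Merge the last node t into the second-to-last node s.
        groups[s] |= groups.pop(t)
        for n, w in adj.pop(t).items():
            del adj[n][t]
            if n != s:
                adj[s][n] = adj[s].get(n, 0.0) + w
                adj[n][s] = adj[n].get(s, 0.0) + w
    return best

def density(nodes, weights):
    """Weighted density: total edge weight divided by the number of node pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    total = sum(weights.get(frozenset(p), 0.0) for p in combinations(nodes, 2))
    return 2.0 * total / (n * (n - 1))

def detect_complexes(nodes, weights, min_density=0.5, min_size=3):
    """Split along minimum cuts until each part is dense enough (assumed rule)."""
    nodes = list(nodes)
    if len(nodes) < min_size:
        return []
    if density(nodes, weights) >= min_density:
        return [set(nodes)]
    _, side = stoer_wagner(nodes, weights)
    inside = [n for n in nodes if n in side]
    outside = [n for n in nodes if n not in side]
    return (detect_complexes(inside, weights, min_density, min_size)
            + detect_complexes(outside, weights, min_density, min_size))

# Toy network: two triangles joined by one weakly supported interaction (C-D).
go_terms = {"A": {"g1", "g2"}, "B": {"g1", "g2"}, "C": {"g1", "g2", "g5"},
            "D": {"g3", "g4", "g5"}, "E": {"g3", "g4"}, "F": {"g3", "g4"}}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
weights = weight_network(edges, go_terms)
cut_value, _ = stoer_wagner(list(go_terms), weights)
complexes = detect_complexes(list(go_terms), weights)
print(cut_value, complexes)
```

In this toy network the C-D interaction has the lowest similarity, so the minimum cut removes it and each triangle is reported as a candidate complex. A real pipeline would replace the Jaccard score with a proper GO semantic similarity measure and would tune the stopping rule against a reference set of known complexes.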
