Protein complex detection using interaction reliability assessment and weighted clustering coefficient

Background: Predicting protein complexes from protein-protein interaction data is becoming a fundamental problem in computational biology. Identifying and characterizing protein complexes is crucial to understanding the molecular events that occur under normal and abnormal physiological conditions. Large datasets of protein-protein interactions have been determined using high-throughput experimental techniques. However, such experimental data are liable to contain a large number of spurious interactions. It is therefore essential to validate these interactions before exploiting them to predict protein complexes.

Results: In this paper, we propose a novel graph mining algorithm (PEWCC) to identify such protein complexes. The algorithm first assesses the reliability of the interaction data, then predicts protein complexes based on the concept of weighted clustering coefficient. To demonstrate the effectiveness of the proposed method, the performance of PEWCC was compared to several methods. PEWCC detected more matched complexes, with higher quality scores, than any of the state-of-the-art methods it was compared against.

Conclusions: The higher accuracy achieved by PEWCC in detecting protein complexes is a valid argument in favor of the proposed method. The datasets and programs are freely available at http://faculty.uaeu.ac.ae/nzaki/Research.htm.
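The two-stage idea described in the abstract (score each interaction for reliability, then use a weighted clustering coefficient to find densely connected neighbourhoods) can be illustrated with a minimal Python sketch. This is not the PEWCC implementation: the abstract does not give the formulas, so the reliability score used here (Jaccard overlap of the two proteins' closed neighbourhoods) and the particular weighted clustering coefficient (sum of edge weights among a node's neighbours divided by the number of neighbour pairs) are assumptions chosen for illustration, and the function names and the toy network are hypothetical.

```python
from itertools import combinations


def jaccard_reliability(adj):
    """Assumed reliability proxy: Jaccard overlap of closed neighbourhoods.

    adj maps each protein to the set of its interaction partners.
    Returns a dict keyed by (u, v) with u < v.
    """
    weights = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # score each undirected edge once
                nu = adj[u] | {u}
                nv = adj[v] | {v}
                weights[(u, v)] = len(nu & nv) / len(nu | nv)
    return weights


def weighted_clustering_coefficient(adj, weights, node):
    """One simple weighted variant (an assumption, not the paper's formula):
    total weight of edges among the node's neighbours, divided by the
    number of neighbour pairs k * (k - 1) / 2.
    """
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    # combinations over a sorted list yields pairs (a, b) with a < b,
    # matching the key order used in jaccard_reliability
    total = sum(weights.get((a, b), 0.0) for a, b in combinations(nbrs, 2))
    return total / (k * (k - 1) / 2)


# Hypothetical toy network: a triangle A-B-C plus a pendant protein D on A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
weights = jaccard_reliability(adj)
scores = {p: weighted_clustering_coefficient(adj, weights, p) for p in adj}
```

In this toy network the pendant interaction A-D receives a lower reliability score than the edges inside the triangle, and B and C score higher than A because all of their neighbours interact with one another. A complex detection method built on this idea would presumably use such scores to select and grow dense neighbourhoods, but that step is not specified in the abstract and is not sketched here.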
