An edge based core-attachment method to detect protein complexes in PPI networks

Characterizing and identifying protein complexes in protein-protein interaction (PPI) networks is important for understanding cellular processes. Building on the core-attachment concept, we propose a novel core-attachment algorithm that characterizes the protein complex core from the perspective of edges. We redefine a protein complex core as a set of closely interrelated edges rather than a set of interrelated proteins. We first identify the edges that must belong to a core, and then partition these edges to extract cores. After that, we select the attachments for each complex core to form a protein complex. Finally, we evaluate the performance of our algorithm by applying it to two different yeast PPI networks. The experimental results show that our algorithm outperforms MCL, CPM, and CoAch in terms of the number of precisely predicted protein complexes, localization, and GO semantic similarity. These results validate the proposed method as an effective algorithm for identifying protein complexes, one that can provide more insights for future biological study. They also suggest that an edge community is a better topological characterization of a protein complex than a node community.
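The three steps described above (identify core edges, partition them into cores, add attachments) can be illustrated with a minimal sketch. The abstract does not give the actual criteria, so everything specific here is an assumption made for illustration: the common-neighbour threshold `min_common` used to pick core edges, the use of connected components to partition them, and the attachment rule `min_fraction` (a protein joins a core if it interacts with more than that fraction of the core's proteins). All function names are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each protein to the set of its interaction partners."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def core_edges(adj, min_common=1):
    """ASSUMED criterion: an edge is a core edge if its endpoints
    share at least `min_common` neighbours. Labels must be orderable."""
    kept = set()
    for u in adj:
        for v in adj[u]:
            if u < v and len(adj[u] & adj[v]) >= min_common:
                kept.add((u, v))
    return kept


def partition_into_cores(kept):
    """ASSUMED partition: each connected component of the graph
    induced by the core edges becomes one core (no overlap)."""
    core_adj = build_adjacency(kept)
    seen, cores = set(), []
    for start in core_adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(core_adj[node] - comp)
        seen |= comp
        cores.append(comp)
    return cores


def add_attachments(adj, core, min_fraction=0.5):
    """ASSUMED rule: attach a neighbour if it interacts with more
    than `min_fraction` of the proteins in the core."""
    candidates = set().union(*(adj[p] for p in core)) - core
    attached = {c for c in candidates
                if len(adj[c] & core) > min_fraction * len(core)}
    return core | attached


def detect_complexes(edges, min_common=1, min_fraction=0.5):
    """Run the three steps and return one protein set per complex."""
    adj = build_adjacency(edges)
    cores = partition_into_cores(core_edges(adj, min_common))
    return [add_attachments(adj, core, min_fraction) for core in cores]


# Toy network with hypothetical protein labels.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("D", "A"), ("D", "B"), ("E", "D")]
complexes = detect_complexes(toy_edges, min_common=2)
```

On this toy network with `min_common=2`, only the edge A-B qualifies as a core edge; C and D are then attached because each interacts with both core proteins, while E is left out. Using connected components forbids overlapping cores, which a real edge-partitioning scheme need not do; that simplification is ours.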
