Complex Detection Based on Integrated Properties

Most current methods focus mainly on topological information and do not consider information from the protein primary sequence, which is of considerable importance for protein complex detection. Based on this observation, we propose a novel algorithm called CDIP (Complex Detection based on Integrated Properties) to discover protein complexes from the yeast PPI network. In our method, a simple feature representation derived from the protein primary sequence is presented and becomes a new part of the feature properties. The algorithm considers both topological and biological information (amino acid background frequency), which helps it detect protein complexes more effectively. Experiments conducted on two public datasets show that the proposed algorithm outperforms two state-of-the-art protein complex detection algorithms.
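As a rough illustration of what a sequence-derived feature based on amino acid frequency could look like, the following minimal sketch computes the relative frequency of the 20 standard amino acids in a sequence and averages those vectors over the proteins of a candidate complex. The function names, the averaging step, and the handling of non-standard residues are assumptions made for this example; the abstract does not specify how CDIP builds its feature.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence):
    """Return the relative frequency of each standard amino acid.

    Hypothetical helper: residues outside the 20 standard letters
    (e.g. 'X') are ignored, which is an assumption of this sketch.
    """
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def complex_sequence_feature(sequences):
    """Average the per-protein frequency vectors of a candidate complex.

    Averaging is an illustrative choice, not the method from the paper.
    """
    vectors = [amino_acid_frequencies(s) for s in sequences]
    if not vectors:
        return [0.0] * len(AMINO_ACIDS)
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

A vector like this could be concatenated with topological features of a candidate subgraph (for example density or mean degree) before scoring, which is one plausible way to combine the two kinds of information.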
