Protein Complexes Prediction Method Based on Core–Attachment Structure and Functional Annotations

Recent advances in high-throughput laboratory techniques have produced large-scale protein–protein interaction (PPI) data, making it possible to build detailed maps of protein interaction networks and, in turn, to detect protein complexes from these networks. However, most current state-of-the-art methods still have shortcomings: they cannot identify overlapping clusters, they do not consider the inherent organization within protein complexes, and they overlook the biological meaning of complexes. We therefore present a novel method for predicting overlapping protein complexes based on core–attachment structure and functional annotations (CFOCM), which proceeds in two stages. First, it detects protein complex cores that maximize our defined cluster closeness function and whose proteins are closely related to at least one common function. Second, it adds attachment proteins to the detected cores to form the returned complexes. For performance evaluation, CFOCM and six classical methods were used to identify protein complexes on three different yeast PPI networks, with three sets of real complexes selected as benchmarks: the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD), and the Catalogues of Yeast protein Complexes (CYC2008). The results show that CFOCM is effective and robust, achieving the highest F-measure values in all tests.
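
The two-stage idea can be illustrated with a minimal Python sketch. It is not the CFOCM implementation: the abstract does not give the cluster closeness function, so edge density is used here as a stand-in, and the attachment rule (a protein joins a complex if it interacts with at least half of the core) is likewise an assumption. All function names, parameters, and the toy network are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(nodes, adj):
    """Fraction of possible edges present among `nodes`.

    Assumption: used only as a stand-in for the paper's closeness function.
    """
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return 2.0 * present / (n * (n - 1))


def grow_core(seed, adj, annotations, min_density):
    """Stage 1: greedily grow a core in which all proteins share a function."""
    core = {seed}
    shared = set(annotations.get(seed, ()))
    while True:
        candidates = set().union(*(adj[u] for u in core)) - core
        best, best_score = None, -1.0
        for c in sorted(candidates):
            # Skip proteins that share no function with the current core.
            if not shared & set(annotations.get(c, ())):
                continue
            score = density(core | {c}, adj)
            if score > best_score:
                best, best_score = c, score
        if best is None or best_score < min_density:
            return frozenset(core)
        core.add(best)
        shared &= set(annotations.get(best, ()))


def predict_complexes(edges, annotations, min_density=0.8,
                      min_core_size=2, attach_ratio=0.5):
    """Detect cores, then add attachment proteins to each core (stage 2)."""
    adj = build_adjacency(edges)
    cores = {grow_core(s, adj, annotations, min_density) for s in adj}
    cores = {c for c in cores if len(c) >= min_core_size}
    complexes = []
    for core in sorted(cores, key=sorted):
        # Assumed rule: attach proteins linked to enough core members.
        attachments = {p for p in adj if p not in core
                       and len(adj[p] & core) >= attach_ratio * len(core)}
        complexes.append(set(core) | attachments)
    return complexes


# Toy network: two triangles bridged by protein D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"), ("D", "B"),
         ("E", "F"), ("E", "G"), ("F", "G"), ("D", "E"), ("D", "F"),
         ("H", "G")]
annotations = {"A": {"f1"}, "B": {"f1"}, "C": {"f1"}, "D": {"f3"},
               "E": {"f2"}, "F": {"f2"}, "G": {"f2"}, "H": {"f4"}}
complexes = predict_complexes(edges, annotations)
print(complexes)
```

On this toy network, protein D shares no function with either core but interacts with two of the three members of each, so it appears as an attachment in both predicted complexes. This shows how separating cores from attachments lets complexes overlap.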
