Semantic and layered protein function prediction from PPI networks.

BACKGROUND Recent advances in high-throughput technologies have produced large-scale data on protein-protein interactions (PPIs) for human and most model species. These data are commonly represented as networks, with nodes representing proteins and edges representing the PPIs. A fundamental challenge in bioinformatics is how to interpret this wealth of data to elucidate the interaction patterns and the biological characteristics of the proteins. One significant purpose of this interpretation is to predict unknown protein functions. Although many approaches have been proposed in recent years, it remains a challenge to measure the functional similarity between proteins reasonably and precisely enough to improve prediction effectiveness.

RESULTS We propose a Semantic and Layered Protein Function Prediction (SLPFP) framework to predict unknown protein functions more effectively at different functional levels. The framework relies on a new protein similarity measurement and a clustering-based protein function prediction algorithm. The similarity measurement incorporates the topological structure of the PPI network as well as each protein's semantic information, expressed in terms of its known functions at different functional layers. Experiments on real PPI datasets were conducted to evaluate the effectiveness of the proposed framework in predicting unknown protein functions.

CONCLUSION The proposed framework achieves higher prediction accuracy than other similar approaches. The prediction results are stable even for a large number of proteins. Furthermore, the framework is able to predict unknown functions at different functional layers within the Munich Information Center for Protein Sequences (MIPS) hierarchical functional scheme. The experimental results demonstrate that the new protein similarity measurement reflects the relationships between proteins more reasonably and precisely.
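
The abstract does not give the similarity formula or the clustering algorithm, so the following Python sketch is only an illustration of the general idea, not the SLPFP method itself. It assumes a topological term (Jaccard overlap of closed neighbourhoods), a semantic term (Jaccard overlap of hierarchical function codes truncated to a chosen layer), a hypothetical mixing weight `alpha`, and a simple similarity-weighted vote standing in for the clustering step. The protein names, edges, and function codes are made-up toy data.

```python
from collections import Counter

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def truncate(code, layer):
    """Cut a dotted hierarchical code to a layer, e.g. '01.03.16' -> '01.03'."""
    return ".".join(code.split(".")[:layer])

def layer_codes(protein, functions, layer):
    """Known function codes of a protein, truncated to the given layer."""
    return {truncate(c, layer) for c in functions.get(protein, ())}

def similarity(p, q, neighbors, functions, layer, alpha=0.5):
    """Assumed blend of topological and semantic similarity (not the paper's formula)."""
    topo = jaccard(neighbors[p] | {p}, neighbors[q] | {q})
    sem = jaccard(layer_codes(p, functions, layer), layer_codes(q, functions, layer))
    return alpha * topo + (1 - alpha) * sem

def predict(target, neighbors, functions, layer, k=3):
    """Similarity-weighted vote over the k most similar annotated proteins."""
    scored = sorted(
        ((similarity(target, q, neighbors, functions, layer), q)
         for q in functions if q != target),
        reverse=True,
    )[:k]
    votes = Counter()
    for score, q in scored:
        for code in layer_codes(q, functions, layer):
            votes[code] += score
    return votes.most_common(1)[0][0] if votes else None

# Toy undirected PPI network: A-B, A-C, B-C, X-A, X-B, D-E (X is unannotated).
neighbors = {
    "A": {"B", "C", "X"}, "B": {"A", "C", "X"}, "C": {"A", "B"},
    "X": {"A", "B"}, "D": {"E"}, "E": {"D"},
}
functions = {
    "A": {"01.03.16"}, "B": {"01.03.04"}, "C": {"01.05"},
    "D": {"10.01"}, "E": {"10.01"},
}

coarse = predict("X", neighbors, functions, layer=1)
fine = predict("X", neighbors, functions, layer=2)
print(coarse, fine)
```

Because the function codes are truncated before comparison, the same routine yields a coarse prediction at layer 1 and a more specific one at layer 2, which mirrors the layered prediction the abstract describes. For an unannotated protein the semantic term is zero in this sketch, so its ranking is driven by topology alone.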
