Predicting protein functions with message passing algorithms

MOTIVATION In the last few years, a growing interest in biology has been shifting toward the problem of optimal information extraction from the huge amount of data generated via large-scale and high-throughput techniques. One of the most relevant issues has recently emerged that of correctly and reliably predicting the functions of a given protein with that of functions exploiting information coming from the whole network of proteins physically interacting with the functionally undetermined one. In the present work, we will refer to an 'observed' protein as the one present in the protein-protein interaction networks published in the literature. METHODS The method proposed in this paper is based on a message passing algorithm known as Belief Propagation, which accepts the network of protein's physical interactions and a catalog of known protein's functions as input, and returns the probabilities for each unclassified protein of having one chosen function. The implementation of the algorithm allows for fast online analysis, and can easily be generalized into more complex graph topologies taking into account hypergraphs, i.e. complexes of more than two interacting proteins. RESULTS Benchmarks of our method are the two Saccharomyces cerevisiae protein-protein interaction networks and the Database of Interacting Proteins. The validity of our approach is successfully tested against other available techniques. CONTACT leone@isiosf.isi.it SUPPLEMENTARY INFORMATION http://isiosf.isi.it/~pagnani

[1]  M. Gerstein,et al.  A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data , 2003, Science.

[2]  D. Eisenberg,et al.  Detecting protein function and protein-protein interactions from genome sequences. , 1999, Science.

[3]  P. Bork,et al.  Functional organization of the yeast proteome by systematic analysis of protein complexes , 2002, Nature.

[4]  T. Ito,et al.  Toward a protein-protein interaction map of the budding yeast: A comprehensive system to examine two-hybrid interactions in all possible combinations between the yeast proteins. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[5]  James R. Knight,et al.  A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae , 2000, Nature.

[6]  B. Schwikowski,et al.  A network of protein–protein interactions in yeast , 2000, Nature Biotechnology.

[7]  A. Barabasi,et al.  Functional and topological characterization of protein interaction networks , 2004, Proteomics.

[8]  Tommi S. Jaakkola,et al.  Physical network models and multi-source data integration , 2003, RECOMB '03.

[9]  Kui Zhang,et al.  Prediction of protein function using protein-protein interaction data , 2002, Proceedings. IEEE Computer Society Bioinformatics Conference.

[10]  R. Ozawa,et al.  A comprehensive two-hybrid analysis to explore the yeast protein interactome , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Charlie Hodgman,et al.  A historical perspective on gene/protein functional assignment , 2000, Bioinform..

[12]  Gary D Bader,et al.  Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry , 2002, Nature.

[13]  Stanley Letovsky,et al.  Predicting protein function from protein/protein interaction data: a probabilistic approach , 2003, ISMB.

[14]  Alessandro Vespignani,et al.  Global protein function prediction from protein-protein interaction networks , 2003, Nature Biotechnology.

[15]  L. Holm,et al.  Unraveling protein interaction networks with near-optimal efficiency , 2004, Nature Biotechnology.

[16]  M. Samanta,et al.  Predicting protein functions from redundancies in large-scale protein interaction networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[17]  Ioannis Xenarios,et al.  DIP: the Database of Interacting Proteins , 2000, Nucleic Acids Res..

[18]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[19]  T. Takagi,et al.  Assessment of prediction accuracy of protein function from protein–protein interaction data , 2001, Yeast.

[20]  William T. Freeman,et al.  Understanding belief propagation and its generalizations , 2003 .

[21]  M. Mézard,et al.  The Bethe lattice spin glass revisited , 2000, cond-mat/0009418.

[22]  D. Eisenberg,et al.  A combined algorithm for genome-wide prediction of protein function , 1999, Nature.

[23]  James R. Knight,et al.  A Protein Interaction Map of Drosophila melanogaster , 2003, Science.