A method to integrate, assess and characterize the protein-protein interactions

Recently, large-scale protein-protein interaction datasets have been recovered for model organisms using similar two-hybrid systems. This information allows us to investigate the protein interaction network from a systematic point of view. However, experimentally determined interactions are susceptible to errors. A previous assessment estimated that only ~10% of the interactions are supported by more than one independent experiment, and that about half of the interactions may be false positives. These false positives can spuriously link unrelated proteins, producing huge apparent interaction clusters that complicate the elucidation of the biological importance of these interactions. To address this problem, we present an approach to integrate, assess and characterize all available protein-protein interactions in the model organisms yeast and fly. We first integrate all available protein-protein interaction databases for yeast and fly and merge the datasets. We then use machine learning techniques to score the reliability of each interaction, and rigorously validate the scoring scheme for yeast protein-protein interactions from several aspects. Our results show that this scoring scheme provides a good basis for selecting reliable protein-protein interaction datasets.
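The integration step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the source labels and the protein identifiers (`P1`, `P2`, ...) are hypothetical, and the sketch only assumes that each database can be reduced to a list of unordered protein pairs. It merges the lists into one non-redundant set and records which sources report each pair, so that interactions supported by more than one independent source can be singled out.

```python
from collections import defaultdict


def merge_interactions(datasets):
    """Merge several interaction lists into one non-redundant mapping.

    datasets: dict mapping a source name to an iterable of (protein_a, protein_b).
    Returns a dict mapping each unordered pair to the set of sources reporting it.
    """
    support = defaultdict(set)
    for source, pairs in datasets.items():
        for a, b in pairs:
            # Interactions are undirected, so (a, b) and (b, a) share one key.
            key = tuple(sorted((a, b)))
            support[key].add(source)
    return dict(support)


def multiply_supported(merged, min_sources=2):
    """Return the pairs reported by at least min_sources distinct sources."""
    return {pair for pair, sources in merged.items() if len(sources) >= min_sources}


# Hypothetical toy input; identifiers and source names are placeholders.
datasets = {
    "source_A": [("P1", "P2"), ("P2", "P3")],
    "source_B": [("P2", "P1"), ("P4", "P5")],
}
merged = merge_interactions(datasets)
well_supported = multiply_supported(merged)
```

In this toy input only the pair (`P1`, `P2`) is reported by both sources. A count of supporting sources is just one possible input feature for a reliability score; the machine learning scoring itself is not reproduced here.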
