Conserved network motifs allow protein-protein interaction prediction

MOTIVATION: High-throughput protein interaction detection methods are strongly affected by false positive and false negative results. Focused experiments are needed to complement the large-scale methods by validating previously detected interactions, but it is often difficult to decide which proteins to probe as interaction partners. Developing reliable computational methods to assist this decision process is a pressing need in bioinformatics.

RESULTS: We show that the conserved properties of the protein network can be used to identify and validate interaction candidates. We apply a number of machine learning algorithms to the protein connectivity information and achieve a surprisingly good overall performance in predicting interacting proteins. Using a 'leave-one-out' approach, we find average success rates between 20 and 40% for predicting the correct interaction partner of a protein. We demonstrate that the success of these methods is based on the presence of conserved interaction motifs within the network.

AVAILABILITY: A reference implementation and a table with candidate interacting partners for each yeast protein are available at http://www.protsuggest.org.
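
The abstract does not say which algorithms were used, so the following is only a minimal sketch of how a leave-one-out evaluation on connectivity information could look. It assumes a simple shared-partner score (candidates are ranked by how many interaction partners they have in common with the query protein), which is an illustrative stand-in and not necessarily one of the authors' methods. The function names and the toy network are hypothetical.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map: protein -> set of partners."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return network


def rank_candidates(network, protein):
    """Rank non-partners of `protein` by the number of shared partners.

    Assumption: shared-partner counting stands in for the (unspecified)
    learning algorithms mentioned in the abstract.
    """
    partners = network.get(protein, set())
    scores = defaultdict(int)
    for partner in partners:
        for candidate in network.get(partner, set()):
            if candidate != protein and candidate not in partners:
                scores[candidate] += 1
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


def leave_one_out_success(edges, top_n=1):
    """Hide each interaction in turn and check whether it is re-predicted.

    Returns the fraction of hidden (protein, partner) pairs for which the
    hidden partner appears among the top_n ranked candidates.
    """
    hits = trials = 0
    for i, (a, b) in enumerate(edges):
        network = build_network(edges[:i] + edges[i + 1:])
        for source, hidden in ((a, b), (b, a)):
            trials += 1
            if hidden in rank_candidates(network, source)[:top_n]:
                hits += 1
    return hits / trials if trials else 0.0


if __name__ == "__main__":
    # Hypothetical toy network, not real yeast data.
    toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
    print(rank_candidates(build_network(toy_edges), "A"))
    print(leave_one_out_success(toy_edges, top_n=1))
```

Hiding one known interaction at a time and asking whether the remaining network recovers it mirrors the 'leave-one-out' idea: a scoring rule can only succeed here if local connection patterns recur across the network.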
