Prediction of Protein-Protein Interactions Using Protein Signature Profiling

Protein domains are conserved and functionally independent structures that play an important role in interactions among related proteins. Domain-domain interactions have recently been used to predict protein-protein interactions (PPI). In general, the interaction probability of a pair of domains is scored using a trained scoring function, and protein pairs carrying domains whose score satisfies a threshold are regarded as “interacting”. In this study, the signature contents of proteins were used to predict PPI pairs in Saccharomyces cerevisiae, Caenorhabditis elegans, and Homo sapiens. Similarity between protein signature patterns was scored, and PPI predictions were drawn from a binary similarity scoring function. Results show that, when evaluated against a test set, the true positive rate of the proposed approach is approximately 32% higher than that of the maximum likelihood estimation method, resulting in a 22% increase in the area under the receiver operating characteristic (ROC) curve. When proteins containing only one or two signatures were removed, the sensitivity of the predicted PPI pairs increased significantly. The predicted PPI pairs are on average 11 times more likely to interact than randomly selected pairs at a confidence level of 0.95, and perform on average 4 times better than pairs predicted by either phylogenetic profiling or gene expression profiling.
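
The scoring step described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the binary similarity scoring function, so the example below assumes a binary-weighted cosine similarity over signature sets; the function names, the threshold value, the `min_signatures` filter, and the signature identifiers are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def binary_cosine(sig_a, sig_b):
    """Cosine similarity between two binary signature profiles.

    Each profile is a set of signature identifiers (present/absent),
    so the dot product reduces to the number of shared signatures.
    This metric is an assumption, not the paper's stated function.
    """
    a, b = set(sig_a), set(sig_b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def predict_pairs(profiles, threshold=0.5, min_signatures=1):
    """Return (protein, protein, score) for pairs scoring >= threshold.

    Proteins with fewer than `min_signatures` signatures are dropped
    first, mirroring the filtering mentioned in the abstract.
    """
    kept = {name: set(sigs) for name, sigs in profiles.items()
            if len(set(sigs)) >= min_signatures}
    names = sorted(kept)
    predictions = []
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            score = binary_cosine(kept[p], kept[q])
            if score >= threshold:
                predictions.append((p, q, score))
    return predictions


# Toy profiles with made-up protein and signature identifiers.
toy_profiles = {
    "P1": {"SIG_A", "SIG_B", "SIG_C"},
    "P2": {"SIG_A", "SIG_B"},
    "P3": {"SIG_D"},
}

for p, q, score in predict_pairs(toy_profiles, threshold=0.5):
    print(p, q, round(score, 3))
```

In this toy case only P1 and P2 share signatures, so they form the single predicted pair. Setting `min_signatures=3` would discard P2 and P3 before scoring, which corresponds to removing proteins that contain only one or two signatures.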
