A domain-based approach to predict protein-protein interactions

Background

Knowing which proteins exist in a given organism or cell type, and how these proteins interact with each other, is necessary for understanding biological processes at the whole-cell level. The determination of protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties remain. In this paper we present DomainGA, a quantitative computational approach that uses information about domain-domain interactions to predict interactions between proteins.

Results

DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for domain-domain pairs. The resulting domain interaction scores are then used to predict whether a pair of proteins interacts. Using yeast PPI data and a series of tests, we show that the DomainGA method is robust and insensitive to the selection of parameter sets, score ranges, and detection rules. DomainGA achieves very high explanation ratios for both the positive and the negative PPIs in yeast. Based on cross-verification tests on human PPIs, a comparison of the optimized scores with the structurally observed domain interactions in the iPFAM database, and a sensitivity and specificity analysis, we conclude that DomainGA shows great promise for application across multiple organisms.

Conclusion

We envision DomainGA as the first step of a multi-tier approach to constructing organism-specific PPI networks. Because it is based on fundamental structural information, the DomainGA approach can be used to create a template of potential PPIs, and the accuracy of this template can be further improved using complementary methods.
Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed using the DomainGA scores are reasonably low, and the erroneous predictions can be filtered further using supplementary approaches such as those based on literature search or other prediction methods.
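The prediction step described above, in which domain-pair scores decide whether two proteins interact, can be illustrated with a minimal Python sketch. The abstract does not specify the detection rule or the score range, so the rule used here (take the maximum score over all scored domain pairs and compare it to a threshold) is an assumption, as are the function names, the toy domain labels, and the toy scores. The score optimization itself is not shown.

```python
from itertools import product


def protein_pair_score(domains_a, domains_b, domain_scores):
    """Return the highest score over all scored domain pairs, or None if none is scored.

    domain_scores maps frozenset({domain_x, domain_y}) -> score (hypothetical layout).
    """
    scores = [
        domain_scores[frozenset((da, db))]
        for da, db in product(domains_a, domains_b)
        if frozenset((da, db)) in domain_scores
    ]
    return max(scores) if scores else None


def predict_interaction(domains_a, domains_b, domain_scores, threshold=0.0):
    """Assumed detection rule: interact if the best domain-pair score exceeds threshold."""
    score = protein_pair_score(domains_a, domains_b, domain_scores)
    return score is not None and score > threshold


def sensitivity_specificity(pairs, labels, protein_domains, domain_scores, threshold=0.0):
    """Compare predictions with known positive/negative PPIs."""
    tp = fn = tn = fp = 0
    for (a, b), interacts in zip(pairs, labels):
        predicted = predict_interaction(
            protein_domains[a], protein_domains[b], domain_scores, threshold
        )
        if interacts:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, invented purely for illustration.
domain_scores = {
    frozenset(("SH3", "PRM")): 0.8,
    frozenset(("KIN", "SH3")): -0.5,
}
protein_domains = {
    "P1": ["SH3"],
    "P2": ["PRM", "KIN"],
    "P3": ["KIN"],
    "P4": ["ZNF"],
}
pairs = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P1", "P4")]
labels = [True, True, False, False]

sens, spec = sensitivity_specificity(pairs, labels, protein_domains, domain_scores)
```

In this toy set, the pair P2 and P3 is a known positive with no scored domain pair, so it is missed. This is the kind of erroneous prediction that complementary methods would need to address.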
