A matrix based algorithm for Protein-Protein Interaction prediction using Domain-Domain Associations.

Protein-Protein Interactions (PPI) are vital to many cellular processes. The availability of high-throughput protein interaction data makes it possible to assess domain associations in interacting proteins computationally. High-throughput PPI data in which every protein in the dataset has been experimentally tested against all the other proteins in the dataset contain information not only on which proteins interact but also on which proteins do not. We call such datasets "all against all" datasets. In the current study, using these datasets and the Pfam domain composition of the proteins in them, we have developed a matrix-based method for predicting PPI. The method infers both positive and negative Domain-Domain Associations (DDA). We have generated more than a million domain association values, which can be used to predict new PPI. Evaluated against a test set, the algorithm had a sensitivity of 68.1% and a specificity of 65.3%. Its overall prediction accuracy on individual test sets from the IntAct, DIP, 3did and iPfam databases and on a literature-curated set from Saccharomyces cerevisiae was around 70%. The insights gained in the study have potential applications in providing leads for experimental interaction studies and in understanding host-pathogen interactions, among others.
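To make the idea of positive and negative domain associations concrete, the following is a minimal sketch in Python and not the scoring scheme used in the study, which the abstract does not specify. It assumes a simple counting rule (+1 for each domain pair seen in an interacting protein pair, -1 for each seen in a non-interacting pair) and a sum-above-zero decision rule. The function names, protein identifiers and domain labels are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def domain_association_scores(domains, interacting, non_interacting):
    """Accumulate a signed score per unordered domain pair.

    Assumed rule (not from the paper): +1 per occurrence in an interacting
    protein pair, -1 per occurrence in a non-interacting protein pair.
    """
    scores = defaultdict(int)
    for pairs, delta in ((interacting, 1), (non_interacting, -1)):
        for prot_a, prot_b in pairs:
            for dom_a, dom_b in product(domains[prot_a], domains[prot_b]):
                scores[frozenset((dom_a, dom_b))] += delta
    return scores


def predict_interaction(scores, domains, prot_a, prot_b):
    """Predict an interaction when the summed domain-pair score is positive."""
    total = sum(
        scores.get(frozenset((dom_a, dom_b)), 0)
        for dom_a, dom_b in product(domains[prot_a], domains[prot_b])
    )
    return total > 0


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)


# Toy "all against all" data with hypothetical proteins and domains.
toy_domains = {"P1": {"D1"}, "P2": {"D2"}, "P3": {"D3"}, "P4": {"D1", "D3"}}
toy_interacting = [("P1", "P2")]
toy_non_interacting = [("P1", "P3"), ("P2", "P3")]

toy_scores = domain_association_scores(
    toy_domains, toy_interacting, toy_non_interacting
)
```

Sensitivity here is the fraction of truly interacting pairs predicted as interacting, and specificity is the fraction of truly non-interacting pairs predicted as non-interacting. Negative scores can only be learned because an "all against all" dataset records which pairs were tested and found not to interact.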
