Biological units and their effect upon the properties and prediction of protein-protein interactions.

Structural data as collated in the Protein Data Bank (PDB) have been widely applied in the study and prediction of protein-protein interactions. However, since the basic PDB Entries contain only the contents of the asymmetric unit rather than the biological unit, some key interactions may be missed when only the PDB Entry is analysed. A total of 69,054 SCOP (Structural Classification of Proteins) domains were examined systematically to identify the number of additional novel interacting domain pairs and interfaces found by considering the biological unit as stored in the PQS (Protein Quaternary Structure) database. The PQS data add 25,965 interacting domain pairs to those seen in the PDB Entries, giving a total of 61,783 redundant interacting domain pairs. Redundancy filtering at the level of the SCOP family shows that PQS increases the number of novel interacting domain-family pairs by 302 (13.3%) from 2277, but only 16/302 (1.4%) of the interacting domain pairs have the two domains in different SCOP families. This suggests that the biological units add little to the elucidation of novel biological interaction networks. However, when the orientation of the domain pairs is considered, the PQS data increase the number of novel domain-domain interfaces observed by 1455 (34.5%), giving 5677 non-redundant domain-domain interfaces. In all, 162/1455 novel domain-domain interfaces are between domains from different families, an increase of 8.9% over the PDB Entries. Overall, the PQS biological units provide a rich source of novel domain-domain interfaces that are not seen in the studied PDB Entries, and so PQS domain-domain interaction data should be exploited wherever possible in the analysis and prediction of protein-protein interactions.
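The percentage increases quoted above can be checked from the counts given in the abstract. The following is a minimal sketch of that arithmetic only; the function and variable names are hypothetical, and the PDB-only baselines for redundant pairs and interfaces are derived here by subtraction rather than stated in the text. The inter-family percentages (1.4% and 8.9%) are not recomputed, because the abstract does not give the PDB baselines they depend on.

```python
def percent_increase(added: int, baseline: int) -> float:
    """Percentage by which `added` items increase a `baseline` count."""
    return 100.0 * added / baseline


# Counts quoted in the abstract.
family_pairs_pdb = 2277      # non-redundant domain-family pairs in PDB Entries
family_pairs_added = 302     # additional pairs contributed by PQS
interfaces_total = 5677      # non-redundant interfaces, PDB + PQS
interfaces_added = 1455      # additional interfaces contributed by PQS
redundant_total = 61783      # redundant interacting domain pairs, PDB + PQS
redundant_added = 25965      # additional redundant pairs contributed by PQS

# Baselines derived by subtraction (an assumption of this sketch:
# total = PDB-only count + PQS additions).
interfaces_pdb = interfaces_total - interfaces_added
redundant_pdb = redundant_total - redundant_added

family_increase = round(percent_increase(family_pairs_added, family_pairs_pdb), 1)
interface_increase = round(percent_increase(interfaces_added, interfaces_pdb), 1)

print(f"Domain-family pairs: +{family_increase}%")
print(f"Domain-domain interfaces: +{interface_increase}%")
print(f"Derived PDB-only redundant pairs: {redundant_pdb}")
```

Under these assumptions the two computed increases agree with the 13.3% and 34.5% reported in the abstract.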
