Prediction of protein–protein interaction sites in heterocomplexes with neural networks.

In this paper we address the problem of extracting features relevant for predicting protein–protein interaction sites from the three-dimensional structures of protein complexes. Our approach is based on information about evolutionary conservation and surface disposition. We implement a neural network-based system that, under a cross-validation procedure, correctly detects 73% of the residues involved in protein interactions in a selected database comprising 226 heterodimers. Our analysis confirms that the physicochemical properties of interacting surfaces are difficult to distinguish from those of the whole protein surface. However, neural networks trained with a reduced representation of the interacting patch and the sequence profile are sufficient to generalize over the different features of the contact patches and to predict whether or not a residue on the protein surface is in contact. In a blind test, we predict the interacting surface sites of three structural components of the DnaK molecular chaperone system and find close agreement with previously published experimental results. We propose that the predictor can significantly complement results from structural and functional proteomics.
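The evaluation setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the input encoding, the network architecture, or how the cross-validation sets were built. The sketch assumes, purely for illustration, that each surface residue is encoded by a window of sequence-profile columns (a stand-in for the reduced patch representation), that folds are split by protein so residues of one protein never appear in both training and test sets, and that the reported 73% corresponds to the fraction of true interface residues that are detected. All function names and parameters are hypothetical.

```python
import random


def window_features(profile, surface, half_window=2):
    """Encode each surface residue as a flat window of profile columns.

    profile: list of per-residue frequency vectors (one row per residue).
    surface: list of booleans, True where the residue is solvent exposed.
    Positions outside the chain are zero-padded. (Assumed encoding.)
    """
    n_cols = len(profile[0])
    pad = [0.0] * n_cols
    features = {}
    for i in range(len(profile)):
        if not surface[i]:
            continue  # only surface residues are candidate contact sites
        vec = []
        for j in range(i - half_window, i + half_window + 1):
            vec.extend(profile[j] if 0 <= j < len(profile) else pad)
        features[i] = vec
    return features


def k_fold_by_protein(protein_ids, k=5, seed=0):
    """Return k (train_ids, test_ids) pairs, split at the protein level."""
    ids = sorted(set(protein_ids))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    return [(sorted(set(ids) - set(fold)), sorted(fold)) for fold in folds]


def detected_fraction(true_labels, predicted_labels):
    """Fraction of true interface residues that were predicted as interface."""
    positives = sum(true_labels)
    hits = sum(1 for t, p in zip(true_labels, predicted_labels) if t and p)
    return hits / positives if positives else 0.0
```

In use, a classifier would be trained on the feature vectors of the training proteins in each fold and `detected_fraction` computed on the held-out proteins; averaging over folds gives a cross-validated estimate comparable in spirit to the figure quoted above.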
