Learning about protein hydrogen bonding by minimizing contrastive divergence

Defining the strength and geometry of hydrogen bonds in protein structures has been a challenging task since the early days of structural biology. In this article, we apply a novel statistical machine learning technique, known as contrastive divergence, to efficiently estimate both the strength and the geometric characteristics of strong interpeptide backbone hydrogen bonds from a dataset of structures representing a variety of protein folds. Despite the simplifying assumptions in the interatomic energy terms used, we determine the strength of these hydrogen bonds to be between 1.1 and 1.5 kcal/mol, in good agreement with earlier experimental estimates. The geometry of these strong backbone hydrogen bonds features an almost linear arrangement of all four atoms involved in hydrogen bond formation. We estimate that about a quarter of all hydrogen bond donors and acceptors participate in these strong interpeptide hydrogen bonds. Proteins 2007. © 2006 Wiley‐Liss, Inc.
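
To make the idea of contrastive divergence concrete, the following is a minimal sketch on a toy model, not the energy function or dataset used in the article. It assumes independent two-state sites (bonded = 1, unbonded = 0) with energy E(s) = -eps * s in units of kT, and recovers eps from synthetic data by comparing the data average with the average after one Metropolis sweep started from the data. All names (`metropolis_sweep`, `fit_cd`, `make_toy_data`) and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def metropolis_sweep(states, eps, rng):
    """Propose one flip per site under the toy energy E(s) = -eps * s (units of kT)."""
    out = []
    for s in states:
        proposed = 1 - s
        delta_e = -eps * (proposed - s)  # energy change of the proposed flip
        if delta_e <= 0 or rng.random() < math.exp(-delta_e):
            out.append(proposed)  # accept the flip
        else:
            out.append(s)  # reject, keep the current state
    return out


def fit_cd(data, steps=400, lr=0.5, k=1, seed=0):
    """Estimate eps by CD-k: the gradient is <s>_data - <s>_reconstruction."""
    rng = random.Random(seed)
    eps = 0.0
    data_mean = sum(data) / len(data)
    for _ in range(steps):
        recon = data  # the chain always starts from the data, never from equilibrium
        for _ in range(k):
            recon = metropolis_sweep(recon, eps, rng)
        recon_mean = sum(recon) / len(recon)
        eps += lr * (data_mean - recon_mean)
    return eps


def make_toy_data(true_eps, n, seed=1):
    """Draw n independent sites from the Boltzmann distribution of the toy model."""
    rng = random.Random(seed)
    p_bonded = 1.0 / (1.0 + math.exp(-true_eps))
    return [1 if rng.random() < p_bonded else 0 for _ in range(n)]


if __name__ == "__main__":
    toy = make_toy_data(true_eps=1.0, n=2000)
    print("estimated eps (kT):", fit_cd(toy))
```

For this toy model the exact maximum-likelihood answer is the log-odds of the bonded fraction in the data, which gives a simple check on the estimate. The appeal of contrastive divergence in a realistic setting is that it needs only a few sampling steps from the data, rather than a fully equilibrated simulation, at each parameter update.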
