A consistent set of statistical potentials for quantifying local side‐chain and backbone interactions

The frequencies of occurrence of atom arrangements in high‐resolution protein structures provide some of the most accurate quantitative measures of interaction energies in proteins. In this report we extend our development of a consistent set of statistical potentials for quantifying local interactions between side‐chains and the polypeptide backbone, as well as between nearby side‐chains. Starting with ϕ/ψ/χ1 propensities that select for optimal interactions of the 20 amino acid side‐chains with the 2 flanking peptide bonds, we add 3 new terms: (1) a distance‐dependent interaction between the side‐chain at position i and the carbonyl oxygens and amide protons of the peptide units at i ± 2, i ± 3, and i ± 4; (2) a distance‐dependent interaction between the side‐chain at position i and side‐chains at positions i + 1 through i + 4; and (3) an orientation‐dependent interaction between the side‐chain at position i and side‐chains at positions i + 1 through i + 4. The relative strengths of these 4 pseudo free energy terms are estimated from the average information content of each scoring matrix and from their performance in a simple fragment threading test. The estimated strengths range from −0.4 to −0.5 kcal/mole per residue for the ϕ/ψ/χ1 propensities, and from −0.15 to −0.6 kcal/mole per residue for each of the other 3 terms. The combined energy function, which contains no interactions between atoms more than 4 residues apart, identifies the correct structural fragment for randomly selected 15‐mers over 40% of the time, after searching through 232,000 alternative conformations. For 14 out of 20 sets of all‐atom Rosetta decoys analyzed, the native structure has a combined score lower than that of any of the 1700–1900 decoy conformations. The ability of this energy function to detect energetically important details of local structure is demonstrated by its power to distinguish high‐resolution crystal structures from NMR solution structures. Proteins 2005. © 2005 Wiley‐Liss, Inc.
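The general idea of turning observed frequencies into pseudo free energies, and of checking whether a native structure outscores its decoys, can be sketched as follows. This is a generic Boltzmann‐inversion illustration, not the scoring scheme of the paper: the abstract does not specify the reference state, the binning, or the smoothing, so the function names, the pseudocount, and the value of RT (about 0.593 kcal/mole, assuming roughly 298 K) are all assumptions made for the example.

```python
import math

# Assumed value of RT in kcal/mole at roughly 298 K (not taken from the paper).
RT_KCAL_PER_MOLE = 0.593


def pseudo_energies(observed, reference, pseudocount=1.0, rt=RT_KCAL_PER_MOLE):
    """Convert bin counts into pseudo free energies, E = -RT * ln(p_obs / p_ref).

    observed  -- dict mapping a bin label to its count for one residue type
    reference -- dict mapping the same bin labels to counts in a reference
                 state (hypothetical here; every reference count must be > 0)
    """
    n_bins = len(reference)
    n_obs = sum(observed.values()) + pseudocount * n_bins
    n_ref = sum(reference.values())
    energies = {}
    for label, ref_count in reference.items():
        # The pseudocount keeps empty observed bins from producing log(0).
        p_obs = (observed.get(label, 0) + pseudocount) / n_obs
        p_ref = ref_count / n_ref
        energies[label] = -rt * math.log(p_obs / p_ref)
    return energies


def native_rank(native_score, decoy_scores):
    """Rank of the native score among decoys; 1 means lower than every decoy.

    Ties are counted against the native structure.
    """
    return 1 + sum(1 for score in decoy_scores if score <= native_score)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    observed = {"helix": 60, "strand": 20, "coil": 20}
    reference = {"helix": 40, "strand": 30, "coil": 30}
    for label, energy in pseudo_energies(observed, reference).items():
        print(f"{label}: {energy:+.3f} kcal/mole")
    print("native rank:", native_rank(-5.0, [-1.0, -3.0, -4.9]))
```

Bins that are over‐represented relative to the reference receive negative (favorable) pseudo energies, and under‐represented bins receive positive ones. A native rank of 1 corresponds to the situation reported for 14 of the 20 Rosetta decoy sets.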
