A united‐residue force field for off‐lattice protein‐structure simulations. I. Functional forms and parameters of long‐range side‐chain interaction potentials from protein crystal data

A two‐stage procedure for the determination of a united‐residue potential designed for protein simulations is outlined. In the first stage, the long‐range and local‐interaction energy terms of the total energy of a polypeptide chain are determined by analyzing protein‐crystal data and averaging the all‐atom energy surfaces. In the second stage (described in the accompanying article), the relative weights of the energy terms are optimized so as to locate the native structures of selected test proteins as the lowest energy structures. The goal of the work in the present study is to parameterize physically reasonable functional forms of the potentials of mean force for side‐chain interactions. The potentials are of both radial and anisotropic type. Radial potentials include the Lennard‐Jones and the shifted Lennard‐Jones potential (with the shift parameter independent of orientation). To treat the angular dependence of side‐chain interactions, three functional forms of the potential that were designed previously to describe anisotropic systems are evaluated: Berne‐Pechukas (dilated Lennard‐Jones); Gay‐Berne (shifted Lennard‐Jones with orientation‐dependent shift parameters); and Gay‐Berne‐Vorobjev (the same as the preceding one, but with one more set of variable parameters). These functional forms were used to parameterize, within a short‐distance range, the potentials of mean force for side‐chain pair interactions that are related by the Boltzmann principle to the pair correlation functions determined from protein‐crystal data. Parameter determination was formulated as a generalized nonlinear least‐squares problem with the target function being the weighted sum of squares of the differences between calculated and “experimental” (i.e., estimated from protein‐crystal data) angular, radial‐angular, and radial pair correlation functions, as well as contact free energies. 
A set of 195 high‐resolution nonhomologous structures from the Protein Data Bank was used to calculate the “experimental” values. The contact free energies were scaled by the slope of the correlation line between side‐chain hydrophobicities, calculated from the contact free energies, and those determined by Fauchère and Pliška from the partition coefficients of amino acids between water and n‐octanol. The methylene group served to define the reference contact free energy corresponding to that between the glycine methylene groups of backbone residues. Statistical analysis of the goodness of fit revealed that the Gay‐Berne‐Vorobjev anisotropic potential best fits the experimental radial and angular correlation functions and contact free energies and therefore represents the free‐energy surface of side‐chain–side‐chain interactions most accurately. It is therefore probably the most appropriate choice for simulations of protein structure. However, simpler functional forms are recommended if computational speed is an issue.

© 1997 by John Wiley & Sons, Inc. J Comput Chem 18: 849–873, 1997
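The fitting idea in the abstract can be illustrated with a minimal sketch, restricted to the radial case. It is not the authors' implementation. It assumes the Boltzmann relation W(r) = -kT ln g(r) between a potential of mean force and a pair correlation function, the standard 12-6 Lennard‐Jones form, and a shifted variant in which the distance is simply offset by a constant (the exact parameterization of the shift used in the paper is not given here, so that form is an assumption). The function names, the value of kT, and the synthetic data are all hypothetical.

```python
import math

KT = 0.593  # kcal/mol near 298 K; an assumed value for illustration only


def pmf_from_g(g, kT=KT):
    """Boltzmann principle: potential of mean force from a pair correlation value."""
    return -kT * math.log(g)


def lennard_jones(r, eps, sigma):
    """Standard 12-6 Lennard-Jones potential with well depth eps and zero at r = sigma."""
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)


def shifted_lennard_jones(r, eps, sigma, shift):
    """Lennard-Jones evaluated at an offset distance (assumed form of the shift)."""
    return lennard_jones(r - shift, eps, sigma)


def g_calc(r, eps, sigma, kT=KT):
    """Radial pair correlation function implied by the model potential."""
    return math.exp(-lennard_jones(r, eps, sigma) / kT)


def target(params, data, kT=KT):
    """Weighted sum of squared differences between calculated and 'experimental' g(r).

    data is a list of (r, g_experimental, weight) tuples.
    """
    eps, sigma = params
    return sum(w * (g_calc(r, eps, sigma, kT) - g_exp) ** 2 for r, g_exp, w in data)


# Synthetic 'experimental' data generated from known parameters (hypothetical values).
true_eps, true_sigma = 0.5, 4.0
distances = [3.5, 4.0, 4.5, 5.0, 6.0]
data = [(r, g_calc(r, true_eps, true_sigma), 1.0) for r in distances]

# The target vanishes at the generating parameters and grows away from them.
print(target((true_eps, true_sigma), data))
print(target((0.8, 4.2), data))
```

In practice the target would be minimized with a nonlinear least‐squares routine and would also include angular and radial‐angular correlation functions and contact free energies, which this radial‐only sketch omits.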
