A novel high resolution CαCα distance dependent force field based on a high quality decoy set

This work presents a novel CαCα distance-dependent force field that successfully selects native structures from an ensemble of high resolution near-native conformers. An enhanced and diverse protein set, together with an improved decoy generation technique, contributes to the effectiveness of this potential. High quality decoys were generated for 1489 nonhomologous proteins and used to train an optimization-based linear programming formulation. The high resolution decoys were generated in order to obtain a simple, distance-dependent force field that yields the native structure as the lowest energy structure and assigns higher energies to decoy structures, both those that closely resemble the native structure and those that are less similar. The model also includes a set of physical constraints based on the experimentally observed physical behavior of the amino acids. The force field was tested on two sets of decoys that were not part of the training set and was found to excel on all the metrics widely used to measure the effectiveness of a force field. The high resolution force field correctly identified 113 native structures out of 150 test cases, and the average rank of the native structure in this test was 1.87. All the high resolution structures (training and testing) used for this work are available online and can be downloaded from http://titan.princeton.edu/HRDecoys. Proteins 2006. © 2006 Wiley-Liss, Inc.
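As a rough illustration of how a pairwise CαCα distance-dependent potential can be used to rank a native structure against decoys, the following minimal Python sketch may help. It is not the authors' implementation: the bin edges, the parameter table `theta`, and the function names are all hypothetical, and the abstract does not specify the actual binning or energy parameters.

```python
import math
from bisect import bisect_right

# Hypothetical distance bin edges in angstroms (assumption, not from the paper).
# They define the bins [3, 5), [5, 7) and [7, 9).
BIN_EDGES = [3.0, 5.0, 7.0, 9.0]


def distance_bin(d):
    """Return the index of the bin containing distance d, or None if out of range."""
    if d < BIN_EDGES[0] or d >= BIN_EDGES[-1]:
        return None
    return bisect_right(BIN_EDGES, d) - 1


def energy(sequence, ca_coords, theta):
    """Sum pairwise energy parameters over all residue pairs.

    sequence  : string of one-letter amino acid codes
    ca_coords : list of (x, y, z) C-alpha coordinates, one per residue
    theta     : dict mapping (aa1, aa2, bin_index) -> energy, with aa1 <= aa2
                (hypothetical parameter table; missing entries count as 0.0)
    """
    total = 0.0
    n = len(sequence)
    for i in range(n):
        for j in range(i + 1, n):
            b = distance_bin(math.dist(ca_coords[i], ca_coords[j]))
            if b is None:
                continue  # pair falls outside every distance bin
            aa1, aa2 = sorted((sequence[i], sequence[j]))
            total += theta.get((aa1, aa2, b), 0.0)
    return total


def native_rank(native_energy, decoy_energies):
    """Rank of the native structure: 1 means no decoy has a lower energy."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)
```

Under these assumptions, a native structure is counted as correctly identified when `native_rank` returns 1, and averaging the rank over all test proteins gives a figure comparable in spirit to the average rank reported above.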
