How to derive a protein folding potential? A new approach to an old problem.

In this paper we introduce a novel method for deriving a pairwise potential for protein folding. The potential is obtained by an optimization procedure that simultaneously maximizes thermodynamic stability for all proteins in the database. When applied to a representative dataset of proteins, with the energy function taken in the pairwise contact approximation, our potential scored somewhat better than existing ones. However, the discrimination of the native structure from decoys is still not strong enough to make the potential useful for ab initio folding. Our results suggest that the problem lies with the pairwise amino acid contact approximation and/or the simplified representation of proteins, rather than with the derivation of the potential. We argue that more detail of protein structure and energetics should be taken into account to achieve energy gaps large enough for folding. The suggested method is general enough to allow systematic derivation of parameters for more sophisticated energy functions. The internal control of validity for a potential derived by our method is convergence to a unique solution upon addition of new proteins to the database. The method is tested on simple model systems in which sequences are designed, using a preset "true" potential, to have low energy in a dataset of structures. Our procedure recovers a potential whose correlation with the true one is r of approximately 91%, and we were able to fold all model structures using the recovered potential. Other statistical knowledge-based approaches were also tested on this model, and the results indicate that they too can recover the true potential with a high degree of accuracy.
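The idea of optimizing a contact potential for stability across a whole database can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the two-letter alphabet, the toy contact lists, the function names (`contact_energy`, `z_score`, `optimize_potential`), the use of the mean Z-score as the objective, and the greedy random search that stands in for whatever optimizer the paper actually uses.

```python
import random
import statistics

def contact_energy(seq, contacts, U):
    """Pairwise contact approximation: sum U[a][b] over residue pairs in contact."""
    return sum(U[seq[i]][seq[j]] for i, j in contacts)

def z_score(seq, native, decoys, U):
    """Z = (E_native - mean(E_decoys)) / std(E_decoys); more negative means more stable."""
    e_nat = contact_energy(seq, native, U)
    e_dec = [contact_energy(seq, d, U) for d in decoys]
    sd = statistics.pstdev(e_dec)
    return (e_nat - statistics.mean(e_dec)) / sd if sd > 0 else 0.0

def mean_z(dataset, U):
    """Assumed objective: Z-score averaged over all proteins in the dataset."""
    return statistics.mean(z_score(s, n, d, U) for s, n, d in dataset)

def optimize_potential(dataset, n_types, steps=2000, step_size=0.1, seed=0):
    """Greedy random search over a symmetric matrix U that lowers the mean Z-score."""
    rng = random.Random(seed)
    U = [[0.0] * n_types for _ in range(n_types)]
    best = mean_z(dataset, U)
    for _ in range(steps):
        a, b = rng.randrange(n_types), rng.randrange(n_types)
        delta = rng.uniform(-step_size, step_size)
        U[a][b] += delta
        if a != b:
            U[b][a] += delta          # keep U symmetric
        trial = mean_z(dataset, U)
        if trial < best:
            best = trial              # accept: native states became more stable
        else:
            U[a][b] -= delta          # reject: undo the perturbation
            if a != b:
                U[b][a] -= delta
    return U, best

# Toy data (hypothetical): residue types 0 and 1; each entry is
# (sequence, native contact list, list of decoy contact lists).
dataset = [
    ([0, 1, 0, 1, 0, 1], [(0, 2), (2, 4), (0, 4)],
     [[(0, 1), (2, 3), (4, 5)], [(1, 3), (3, 5), (0, 2)], [(0, 3), (1, 4), (2, 4)]]),
    ([0, 0, 1, 1, 0, 1], [(0, 1), (1, 4), (0, 4)],
     [[(0, 2), (1, 3), (4, 5)], [(2, 3), (3, 5), (0, 1)], [(2, 5), (0, 3), (1, 4)]]),
]

U, best = optimize_potential(dataset, n_types=2)
print("optimized mean Z-score:", round(best, 3))
```

Because the Z-score is unchanged when all entries of `U` are multiplied by a positive constant, the search needs no explicit normalization. A convergence check in the spirit of the abstract would rerun the optimization after adding proteins to `dataset` and compare the resulting matrices.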
