Protein refolding in silico with atom-based statistical potentials and conformational search using a simple genetic algorithm.

A distance-dependent atom-pair potential that treats long-range and local interactions separately has been developed and optimized to distinguish native protein structures from sets of incorrect or decoy structures. Atoms are divided into 30 types based on chemical properties and relative position in the amino acid side-chains. Several parameters affecting the calculation and evaluation of this statistical potential, such as the reference state, the bin width, cutoff distances between pairs, and the number of residues separating the atom pairs, are adjusted to achieve the best discrimination. The native structure has the lowest energy for 39 of the 40 sets of original ROSETTA decoys (1000 structures per set) and 23 of the 25 improved decoys (approximately 1900 structures per set). Combined with the orientation-dependent backbone hydrogen bonding potential used by ROSETTA and a statistical solvation potential based on the solvent exclusion model of Lazaridis & Karplus, this potential is used as a scoring function for conformational search based on a genetic algorithm method. After unfolding the native structure by changing every phi and psi angle by either ±3, ±5, or ±7 degrees, five small proteins can be efficiently refolded, in some cases to within 0.5 Å Cα distance matrix error (DME) of the native state. Although no significant correlation is found between the total energy and structural similarity to the native state, a surprisingly strong correlation exists between the radius of gyration and the DME for low-energy structures.
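The two structural measures compared above, the Cα distance matrix error and the radius of gyration, can be illustrated with a minimal sketch. It assumes the commonly used definitions: DME as the root-mean-square difference between corresponding inter-atom distances over all i<j pairs, and an unweighted (equal-mass) radius of gyration. The abstract does not state the exact formulas used, and the function names here are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Distances for all i<j pairs of 3D points, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_matrix_error(coords_a, coords_b):
    """RMS difference between the two structures' inter-atom distances.

    Assumed definition: sqrt(mean over i<j of (d_ij^A - d_ij^B)^2).
    No superposition is needed because only internal distances are used.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    dist_a = pairwise_distances(coords_a)
    dist_b = pairwise_distances(coords_b)
    if not dist_a:
        raise ValueError("need at least two atoms")
    squared = sum((x - y) ** 2 for x, y in zip(dist_a, dist_b))
    return math.sqrt(squared / len(dist_a))


def radius_of_gyration(coords):
    """Unweighted radius of gyration (all atoms given equal mass)."""
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)


# Toy coordinates (not real protein data): three points on a line,
# and the same points stretched by a factor of two.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

dme = distance_matrix_error(native, stretched)
rg_native = radius_of_gyration(native)
rg_stretched = radius_of_gyration(stretched)
```

In this toy case the stretched chain has both a larger radius of gyration and a nonzero DME relative to the reference, which is the kind of relationship the reported correlation concerns. Real use would take Cα coordinates from a structure file instead of hand-written tuples.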
