A distance‐dependent atomic knowledge‐based potential for improved protein structure selection

A heavy‐atom, distance‐dependent, knowledge‐based pairwise potential has been developed. This statistical potential is first evaluated and optimized using native‐structure z‐scores from gapless threading. It is then used to recognize native and near‐native structures in both published decoy test sets and decoys generated by our group's protein structure prediction program. In the gapless threading test, the optimized atomic potential improves the average z‐score by 4 units over the residue‐based quasichemical potential. Examination of the z‐scores for individual pairwise distance shells indicates that specificity for the native protein structure is greatest at pairwise distances of 3.5–6.5 Å, i.e., in the first solvation shell. When applied to test sets obtained from the web, composed of native protein and decoy structures, the current generation of the potential performs better than residue‐based potentials and the other published atomic potentials at selecting native and near‐native structures. The potential is also applied to structures of varying quality generated by our group's protein structure prediction program, where it tends to pick lower‐RMSD structures than do residue‐based contact potentials. In particular, this atomic pairwise interaction potential has better selectivity for near‐native structures. As such, it can be used to select near‐native folds generated by structure prediction algorithms as well as for protein structure refinement. Proteins 2001;44:223–232. © 2001 Wiley‐Liss, Inc.
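The evaluation described in the abstract (scoring a native structure against gapless-threading decoys and reporting a z-score) can be illustrated with a minimal sketch. It is not the authors' implementation. The 1 Å shell width, the lookup-table layout, the toy atom-type labels, and all function names are assumptions made for illustration, as is the sign convention in which a more negative z-score means better separation of the native structure from the decoys.

```python
import math
from statistics import mean, pstdev

SHELL_WIDTH = 1.0  # Å; hypothetical bin width, not taken from the paper


def shell_index(distance):
    """Map a pairwise distance to the index of its distance shell."""
    return int(distance // SHELL_WIDTH)


def threading_energy(types, coords, table):
    """Sum tabulated pair energies over all atom pairs.

    `table` maps (type_a, type_b, shell) to an energy, with the two types
    stored in sorted order; pairs missing from the table contribute zero.
    """
    total = 0.0
    for i in range(len(types)):
        for j in range(i + 1, len(types)):
            d = math.dist(coords[i], coords[j])
            a, b = sorted((types[i], types[j]))
            total += table.get((a, b, shell_index(d)), 0.0)
    return total


def gapless_threadings(types, template_coords):
    """Yield every contiguous, gap-free placement of the sequence on a template."""
    n = len(types)
    for offset in range(len(template_coords) - n + 1):
        yield template_coords[offset:offset + n]


def z_score(native_energy, decoy_energies):
    """Native energy in units of the decoy standard deviation (assumed convention)."""
    return (native_energy - mean(decoy_energies)) / pstdev(decoy_energies)
```

In use, each placement produced by `gapless_threadings` would be scored with `threading_energy` to form the decoy energy distribution, and the native structure's energy would be compared against it with `z_score`. Restricting `table` to a single shell index gives a per-shell z-score of the kind the abstract uses to compare distance ranges.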
