A free-rotating and self-avoiding chain model for deriving statistical potentials based on protein structures.

Statistical potentials are widely used in protein studies, although their theoretical basis remains much debated. In this work, we apply two physical reference states to derive statistical potentials from protein structural features, with the aim of achieving zero interaction and orthogonalization. The free-rotating chain-based potential uses a local free-rotating chain reference state, which can be described theoretically by a Gaussian distribution. The self-avoiding chain-based potential uses a reference state derived from a database of artificial self-avoiding backbones generated by Monte Carlo simulation. These physical reference states are independent of known protein structures and rest solely on an analytical formulation or a simulation method. The new potentials yielded higher Z-scores and success rates than other statistical potentials. The end-to-end distance distribution produced by the self-avoiding chain model was similar to the distance distribution of protein atoms in the structure database. This similarity may partly explain the basis of reference states that depend on the atom-pair frequencies observed in protein databases. The current study shows that a more physical reference model improves the performance of statistical potentials in protein fold recognition, and the approach could be extended to other types of applications.
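The two reference states can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes the common inverse-Boltzmann form E(r) = -kT ln[P_obs(r) / P_ref(r)], a bead-per-residue chain with a 3.8 Å bond, a 3.0 Å overlap cutoff, and simple whole-chain rejection sampling in place of a full Monte Carlo protocol. All function names and parameter values are hypothetical.

```python
import math
import random

KT = 0.593  # kcal/mol near 298 K; the energy unit is an assumption


def gaussian_end_to_end_pdf(r, mean_sq):
    """Radial density of the end-to-end distance r for a Gaussian chain
    with mean-square end-to-end distance mean_sq."""
    norm = (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return 4.0 * math.pi * r * r * norm * math.exp(-3.0 * r * r / (2.0 * mean_sq))


def random_unit_vector(rng):
    """Uniformly distributed direction on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


def self_avoiding_chain(n_beads, rng, bond=3.8, min_dist=3.0, max_tries=100000):
    """Grow a fixed-bond-length random walk and restart from scratch on the
    first overlap between non-bonded beads (assumed geometry, in angstroms)."""
    for _attempt in range(max_tries):
        coords = [(0.0, 0.0, 0.0)]
        for _step in range(n_beads - 1):
            u = random_unit_vector(rng)
            new = tuple(c + bond * d for c, d in zip(coords[-1], u))
            # Compare with every earlier bead except the bonded neighbour.
            if any(math.dist(new, old) < min_dist for old in coords[:-1]):
                break
            coords.append(new)
        else:
            return coords
    raise RuntimeError("no self-avoiding chain found; relax min_dist or n_beads")


def histogram(values, lo, hi, n_bins):
    """Normalized histogram on [lo, hi); values outside the range are ignored."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def statistical_potential(observed, reference, lo, hi, n_bins, kt=KT):
    """Per-bin E(r) = -kT ln(P_obs / P_ref); None where either bin is empty."""
    p_obs = histogram(observed, lo, hi, n_bins)
    p_ref = histogram(reference, lo, hi, n_bins)
    return [
        -kt * math.log(po / pr) if po > 0 and pr > 0 else None
        for po, pr in zip(p_obs, p_ref)
    ]
```

In this sketch, distances drawn from simulated chains serve as the reference set and distances measured in a structure database would serve as the observed set. When the two distributions coincide the potential vanishes in every populated bin, which is the zero-interaction condition mentioned above. The analytical Gaussian density could replace the simulated histogram for the free-rotating chain case.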
