A reduced protein model with accurate native‐structure identification ability

A protein model is described that is simple enough to be used in protein‐folding simulations yet accurate enough to identify a protein's native fold. Geometrically, each residue is represented by one, two, or three pseudoatoms, depending on the residue size. The energy is given by a pairwise, knowledge‐based potential obtained for all pseudoatom pairs as a function of their relative distance; the potential also depends on the separation along the primary chain and on residue order. The model is tested by gapless threading on a large, representative set of known protein structures and on decoy structures obtained from the “Decoys ‘R’ Us” database. It is also tested by threading on gapped decoys generated for proteins with many homologs. In the gapless threading tests on over 2200 proteins, the native structure is recognized as the lowest‐energy structure in nearly 98% of cases and as one of the three lowest‐energy structures in almost 100%. In the decoy threading tests, the model recognizes the majority of the native structures. It also recognizes native structures among gapped decoys, despite their close structural similarity. The results indicate that the native‐recognition ability of the pseudoatomic model is similar to that of comparable atom‐based models and much better than that of equivalent residue‐based models. Proteins 2003. © 2003 Wiley‐Liss, Inc.
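
The gapless threading test described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`threading_energy`, `gapless_threading`, `native_rank`), the distance binning, and the lookup table `table` are all assumptions made for illustration. For brevity the sketch uses one interaction site per residue, whereas the model in the abstract uses up to three pseudoatoms per residue, and it reduces the chain‐separation dependence to a simple minimum‐separation cutoff.

```python
import numpy as np

def threading_energy(seq_types, coords, table, bin_width=1.0, min_sep=3):
    """Sum a distance-binned pair potential over all site pairs.

    seq_types : integer type index of each site (hypothetical encoding)
    coords    : (N, 3) array of site positions
    table     : (T, T, B) array of pair energies per distance bin (assumed given)
    min_sep   : minimum chain separation |i - j| for a pair to be counted
    """
    seq_types = np.asarray(seq_types)
    coords = np.asarray(coords, dtype=float)
    # All index pairs (i, j) with j - i >= min_sep.
    i, j = np.triu_indices(len(seq_types), k=min_sep)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    bins = (dist / bin_width).astype(int)
    # Pairs beyond the last tabulated bin contribute zero energy.
    inside = bins < table.shape[2]
    return float(table[seq_types[i][inside], seq_types[j][inside], bins[inside]].sum())

def gapless_threading(seq_types, template_coords, table, **kwargs):
    """Energy of the sequence placed at every gapless offset along a template."""
    n = len(seq_types)
    template_coords = np.asarray(template_coords, dtype=float)
    offsets = range(len(template_coords) - n + 1)
    return [threading_energy(seq_types, template_coords[o:o + n], table, **kwargs)
            for o in offsets]

def native_rank(native_energy, decoy_energies):
    """Rank of the native structure by energy (1 means lowest energy)."""
    return 1 + sum(e < native_energy for e in decoy_energies)
```

Under these assumptions, a native structure counts as recognized when `native_rank` returns 1, and as one of the three lowest‐energy structures when it returns 3 or less.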
