Ab initio construction of polypeptide fragments: Efficient generation of accurate, representative ensembles

We describe a novel method to generate ensembles of conformations of the main‐chain atoms {N, Cα, C, O, Cβ} for a sequence of amino acids within the context of a fixed protein framework. Each conformation satisfies fundamental stereochemical restraints such as idealized geometry, favorable ϕ/ψ angles, and excluded volume. The ensembles include conformations both near to and far from the native structure. Algorithms for effective conformational sampling and constant‐time overlap detection permit the generation of thousands of distinct conformations in minutes. Unlike previous approaches, our method samples dihedral angles from fine‐grained ϕ/ψ state sets, which we demonstrate to be superior to exhaustive enumeration from coarse ϕ/ψ sets. Applied to a large set of loop structures, our method consistently samples near‐native conformations, averaging main‐chain root‐mean‐square deviations of 0.4, 1.1, and 2.2 Å for loops of four, eight, and twelve residues, respectively. The ensembles make ideal decoy sets for assessing the discriminatory power of a selection method. Using these decoy sets, we conclude that the quality of anchor geometry cannot reliably identify near‐native conformations, although the selection results are comparable to those of previous loop prediction methods. In a subsequent study (de Bakker et al.: Proteins 2003;51:21–40), we demonstrate that the AMBER force field with the Generalized Born solvation model identifies near‐native conformations significantly better than previous methods. Proteins 2003;51:41–55. © 2003 Wiley‐Liss, Inc.
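The abstract names constant-time overlap detection but does not describe how it works. One common way to get near-constant-time clash checks against a fixed framework is a uniform grid (cell list), sketched below. This is an illustration under stated assumptions, not the authors' algorithm: the class name `OverlapGrid`, the single cutoff `min_dist`, and the example coordinates are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


class OverlapGrid:
    """Uniform grid (cell list) over fixed framework atoms.

    Hypothetical sketch: a single distance cutoff stands in for the
    excluded-volume restraint; real atom-type-specific radii are not modeled.
    """

    def __init__(self, atoms, min_dist):
        # atoms: iterable of (x, y, z) tuples in Angstroms
        # min_dist: assumed clash cutoff; also used as the cell edge length
        self.min_dist = min_dist
        self.cells = defaultdict(list)
        for atom in atoms:
            self.cells[self._key(atom)].append(atom)

    def _key(self, point):
        # Integer cell index of a point along each axis
        return tuple(math.floor(c / self.min_dist) for c in point)

    def clashes(self, point):
        """Return True if any stored atom lies closer than min_dist."""
        kx, ky, kz = self._key(point)
        # Because the cell edge equals the cutoff, any clashing atom must
        # sit in the query cell or one of its 26 neighbors.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for other in self.cells.get((kx + dx, ky + dy, kz + dz), ()):
                if math.dist(point, other) < self.min_dist:
                    return True
        return False


framework = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
grid = OverlapGrid(framework, min_dist=3.0)
print(grid.clashes((1.0, 1.0, 1.0)))     # True: about 1.73 Å from the first atom
print(grid.clashes((10.0, 10.0, 10.0)))  # False: no atom in neighboring cells
```

Each query inspects at most 27 cells regardless of framework size, so the cost per candidate atom stays roughly constant as long as the number of atoms per cell is bounded, which excluded volume itself tends to guarantee.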
