A novel exhaustive search algorithm for predicting the conformation of polypeptide segments in proteins

We present a fast ab initio method for predicting local conformations in proteins. The program, PETRA, selects polypeptide fragments from a computer‐generated database (APD) that encodes all possible peptide fragments up to twelve amino acids long. Each fragment is defined by a representative set of eight ϕ/ψ pairs, obtained iteratively from a trial set by calculating how well the fragments generated from them represent the Protein Data Bank (PDB). Ninety‐six percent of length‐five fragments in crystal structures with a resolution better than 1.5 Å and less than 25% sequence identity have a conformer in the database within 1 Å root‐mean‐square deviation (rmsd). To select segments from APD, PETRA applies a set of simple rule‐based filters that reduce the number of potential conformations to a manageable total. This reduced set is scored and sorted using the rmsd fit to the anchor regions and a knowledge‐based energy function that depends on the sequence to be modelled. The best‐scoring fragments can then be optimized by minimizing contact potentials and the rmsd fit to the core model. The quality of PETRA's predictions is evaluated by calculating the differences in both rmsd and backbone torsion angles between the final model and the native fragment. The average rmsd ranges from 1.4 Å for three‐residue loops to 3.9 Å for eight‐residue loops. Proteins 2000;40:135–144. © 2000 Wiley‐Liss, Inc.
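
The combinatorics behind such an exhaustive fragment database can be illustrated with a short Python sketch. It is not PETRA's implementation: the eight ϕ/ψ values below are hypothetical placeholders rather than the representative set derived in the paper, the function names are invented for illustration, and the comparison is made in torsion space rather than by Cartesian rmsd after superposition. It only shows that eight states per residue give 8^n conformers for an n-residue fragment, and how a periodic torsion-angle difference might be computed.

```python
import itertools
import math

# Hypothetical placeholder phi/psi pairs (degrees).
# These are NOT the eight representative pairs used by PETRA.
REPRESENTATIVE_PAIRS = [
    (-60.0, -45.0),
    (-120.0, 130.0),
    (-75.0, 150.0),
    (-90.0, 0.0),
    (60.0, 45.0),
    (-150.0, 160.0),
    (-65.0, -25.0),
    (80.0, -170.0),
]


def count_conformers(n_residues, n_states=len(REPRESENTATIVE_PAIRS)):
    """Number of conformers when each residue takes one of n_states pairs."""
    return n_states ** n_residues


def enumerate_conformers(n_residues, pairs=REPRESENTATIVE_PAIRS):
    """Lazily yield every assignment of one phi/psi pair per residue."""
    return itertools.product(pairs, repeat=n_residues)


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def torsion_rms(conf_a, conf_b):
    """RMS of the periodic phi and psi differences between two conformers."""
    squared = [
        angle_diff(x, y) ** 2
        for res_a, res_b in zip(conf_a, conf_b)
        for x, y in zip(res_a, res_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


def closest_conformer(target, pairs=REPRESENTATIVE_PAIRS):
    """Exhaustively search for the conformer nearest to target in torsion space."""
    return min(
        enumerate_conformers(len(target), pairs),
        key=lambda conf: torsion_rms(conf, target),
    )


if __name__ == "__main__":
    for n in (3, 5, 8, 12):
        print(n, count_conformers(n))
    native = ((-63.0, -40.0), (-118.0, 135.0), (58.0, 40.0))
    print(closest_conformer(native))
```

The count grows from 512 conformers for three residues to 8^12 (about 6.9 × 10^10) for twelve, which is why some form of filtering is needed before scoring. Torsion-space distance is used here only because it needs no coordinate building; it is not a substitute for the rmsd measure reported in the abstract.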
