Predicting the conformational class of short and medium size loops connecting regular secondary structures: application to comparative modelling.

Loops are regions of non-repetitive conformation connecting regular secondary structures. They are both the most difficult and error-prone regions of a protein to solve by X-ray crystallography and the hardest regions to model using comparative procedures. Although a loop can sometimes be modelled from a homologue, very often it must be selected from outside the family. The loop prediction procedure, SLoop, attempts to identify the conformational class of the loop rather than to select a specific loop from a set of fragments extracted from known structures or generated ab initio. Templates are constructed for each of the 161 loop conformational classes that have been identified from the clustering of the structures of some 2024 loops of one to eight residues in length. A class template describes both the sequence preferences and the relative disposition of the bounding secondary structures. During comparative modelling, the conformation of a loop can be predicted by identifying a loop class with which its sequence and the disposition of its bounding secondary structures are compatible. The procedure is tested on an unrelated non-redundant set of 1785 loops under stringent and lax evaluation schemes. Optimal sequence score cut-offs are identified such that the prediction rate is equal to the percentage of loops assigned to acceptable classes. Under the stringent evaluation, at the optimal sequence score cut-off, a conformation is predicted for 50% of loops, of which 47% are correct, while under the lax evaluation a conformation is predicted for 63% of loops, of which 54% are correct. Sequence score is shown to be a good indicator of the probability of a prediction being correct. Loop length also has a strong effect on prediction outcomes. Considering only loops of two to five residues in length, under the stringent evaluation a conformation is predicted for 62% of loops, with 52% of these predictions being correct, while under the lax evaluation predictions are provided for 75% of loops, of which 57% are correct.
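
The class-assignment idea described above can be illustrated with a minimal sketch. It is not the SLoop implementation: the `LoopClassTemplate` structure, the per-position residue scores, the two-letter `flanks` code standing in for the disposition of the bounding secondary structures, and the default penalty are all assumptions made for illustration. The sketch only shows the general pattern of filtering templates by compatibility, scoring the sequence, withholding a prediction below a cut-off, and then reporting the two quantities quoted in the abstract (the fraction of loops predicted and the fraction of those predictions that are correct).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LoopClassTemplate:
    """Hypothetical template: one residue-preference table per loop position."""
    name: str
    flanks: str              # assumed code for bounding structures, e.g. "EE"
    position_scores: tuple   # tuple of dicts mapping residue -> score

    @property
    def length(self) -> int:
        return len(self.position_scores)


def sequence_score(template: LoopClassTemplate, loop_seq: str,
                   default: float = -1.0) -> float:
    """Sum per-position scores; residues absent from a table get `default`."""
    return sum(table.get(res, default)
               for table, res in zip(template.position_scores, loop_seq))


def predict_class(templates: Sequence[LoopClassTemplate], loop_seq: str,
                  flanks: str, cutoff: float) -> Optional[str]:
    """Return the best compatible class name, or None if no class reaches the cut-off."""
    candidates = [t for t in templates
                  if t.flanks == flanks and t.length == len(loop_seq)]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: sequence_score(t, loop_seq))
    if sequence_score(best, loop_seq) < cutoff:
        return None
    return best.name


def coverage_and_accuracy(predictions, truths):
    """Fraction of loops given a prediction, and fraction of those that are correct."""
    made = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    coverage = len(made) / len(predictions)
    accuracy = sum(p == t for p, t in made) / len(made) if made else 0.0
    return coverage, accuracy


# Toy templates with made-up scores, for illustration only.
toy_templates = [
    LoopClassTemplate("EE-2-a", "EE", ({"N": 2.0, "D": 1.0}, {"G": 3.0})),
    LoopClassTemplate("EE-2-b", "EE", ({"G": 1.5}, {"N": 1.0})),
]
example = predict_class(toy_templates, "NG", "EE", cutoff=2.0)
```

Raising the cut-off in such a scheme lowers the fraction of loops predicted and would be expected to raise the fraction correct, which is the trade-off the optimal cut-off balances. With the stringent figures quoted above, the share of all loops that receive a correct prediction is 0.50 × 0.47 ≈ 23.5%.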
