Improved protein loop prediction from sequence alone.

The SLoop database of supersecondary fragments, first described by Donate et al. (Protein Sci., 1996, 5, 2600-2616), contains protein loops classified according to structural similarity. The database has recently been updated and currently contains over 10 000 loops of up to 20 residues in length, which cluster into over 560 well-populated classes. The database can be found at http://www-cryst.bioc.cam.ac.uk/~sloop. In this paper, we identify conserved structural features such as main chain conformation and hydrogen bonding. Using the original approach of Rufino and co-workers (1997), the correct structural class receives the highest SLoop score for 35% of loops. This rises to 65% when the three highest scoring class predictions are considered and to 75% when the five highest scoring class predictions are considered. Inclusion of residues from the neighbouring secondary structures and use of substitution tables derived using a reduced definition of secondary structure increase these prediction accuracies to 58, 78 and 85%, respectively. This suggests that capping residues can stabilize the loop conformation as well as that of the secondary structure. Further increases are achieved if only well-populated classes are considered in the prediction. These results correspond to an average loop root mean square deviation of between 0.4 and 2.6 Å for loops up to five residues in length.
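The top-one, top-three and top-five accuracies quoted above can be illustrated with a minimal sketch. It assumes each loop has a score for every candidate class and counts a prediction as correct when the true class is among the k highest scoring classes. The function name `top_k_accuracy`, the class labels and the scores are all hypothetical and serve only as an illustration; this is not the SLoop scoring code.

```python
from typing import Dict, List


def top_k_accuracy(
    scores: List[Dict[str, float]], true_classes: List[str], k: int
) -> float:
    """Fraction of loops whose true class is among the k highest scoring classes."""
    if len(scores) != len(true_classes):
        raise ValueError("scores and true_classes must have the same length")
    if not scores:
        return 0.0
    hits = 0
    for loop_scores, true_class in zip(scores, true_classes):
        # Rank candidate classes from highest to lowest score.
        ranked = sorted(loop_scores, key=loop_scores.get, reverse=True)
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy data (made up): four loops, three candidate classes each.
toy_scores = [
    {"A": 0.9, "B": 0.5, "C": 0.1},
    {"A": 0.2, "B": 0.7, "C": 0.6},
    {"A": 0.8, "B": 0.3, "C": 0.5},
    {"A": 0.1, "B": 0.4, "C": 0.9},
]
toy_true = ["A", "C", "B", "C"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(toy_scores, toy_true, k))
```

By construction this measure cannot decrease as k grows, which is why the reported accuracy rises from the top prediction to the top three and top five.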

[1]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[2]  L. Lai,et al.  Protein loops on structurally similar scaffolds: database and conformational analysis. , 1999, Biopolymers.

[3]  K. Fidelis,et al.  Comparison of systematic search and database methods for constructing segments of protein structure. , 1994, Protein engineering.

[4]  S. Wodak,et al.  Automatic classification and analysis of alpha alpha-turn motifs in proteins. , 1996, Journal of molecular biology.

[5]  H. Meirovitch,et al.  Backbone entropy of loops as a measure of their flexibility: Application to a Ras protein simulated by molecular dynamics , 1997, Proteins.

[6]  Baldomero Oliva,et al.  An automated classification of the structure of protein loops. , 1997, Journal of molecular biology.

[7]  C. Deane,et al.  A novel exhaustive search algorithm for predicting the conformation of polypeptide segments in proteins , 2000, Proteins.

[8]  J. Wójcik,et al.  New efficient statistical sequence-dependent structure prediction of short to medium-sized protein loops based on an exhaustive loop classification. , 1999, Journal of molecular biology.

[9]  M Karplus,et al.  Analysis of two-residue turns in proteins. , 1994, Journal of molecular biology.

[10]  B. L. Sibanda,et al.  β-Hairpin families in globular proteins , 1985, Nature.

[11]  B. L. Sibanda,et al.  Conformation of beta-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. , 1989, Journal of molecular biology.

[12]  J. Felsenstein CONFIDENCE LIMITS ON PHYLOGENIES: AN APPROACH USING THE BOOTSTRAP , 1985, Evolution; international journal of organic evolution.

[13]  Charlotte M. Deane,et al.  Browsing the SLoop database of structurally classified loops connecting elements of protein secondary structure , 2000, Bioinform..

[14]  T L Blundell,et al.  Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains. , 1987, Protein engineering.

[15]  J. Kwasigroch,et al.  A global taxonomy of loops in globular proteins. , 1996, Journal of molecular biology.

[16]  M. Karplus,et al.  PDB-based protein loop prediction: parameters for selection and methods for optimization. , 1997, Journal of molecular biology.

[17]  A. Lesk,et al.  Conformations of immunoglobulin hypervariable regions , 1989, Nature.

[18]  Andrew J. Martin,et al.  Structural families in loops of homologous proteins: automatic classification, modelling and application to antibodies. , 1996, Journal of molecular biology.

[19]  S. Wodak,et al.  Modelling the polypeptide backbone with 'spare parts' from known protein structures. , 1989, Protein engineering.

[20]  John P. Overington,et al.  HOMSTRAD: A database of protein structure alignments for homologous families , 1998, Protein science : a publication of the Protein Society.

[21]  R. Samudrala,et al.  An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. , 1998, Journal of molecular biology.

[22]  Jiří Novotný,et al.  Structure of antibody hypervariable loops reproduced by a conformational search algorithm , 1988, Nature.

[23]  M. Karplus,et al.  Conformational sampling using high‐temperature molecular dynamics , 1990, Biopolymers.

[24]  John P. Overington,et al.  Fragment ranking in modelling of protein structure. Conformationally constrained environmental amino acid substitution tables. , 1993, Journal of molecular biology.

[25]  T. Blundell,et al.  Predicting the conformational class of short and medium size loops connecting regular secondary structures: application to comparative modelling. , 1997, Journal of molecular biology.

[26]  M. Karplus,et al.  Prediction of the folding of short polypeptide segments by uniform conformational sampling , 1987, Biopolymers.

[27]  T. Blundell,et al.  Conformational analysis and clustering of short and medium size loops connecting regular secondary structures: A database for modeling and prediction , 1996, Protein science : a publication of the Protein Society.

[28]  John P. Overington,et al.  Knowledge‐based protein modelling and design , 1988.