Importance of anchor group positioning in protein loop prediction

The aim of loop prediction in protein homology modeling is to connect the main chain ends of two successive regions that are conserved in the template and target structures, using protein fragments that are as similar to the target as possible. To develop a new loop prediction method, examples of insertions and deletions were searched for automatically in data sets of structurally aligned protein pairs. Three different criteria were applied to determine the positions where the main chain conformations of the proteins begin to differ, i.e., the anchoring groups of the insertions and deletions, giving three test data sets. The target structures in these data sets were predicted by inserting fragments from different fragment data banks between the anchoring groups of the templates. The proposed matching fragments were sorted by decreasing correspondence in the geometry of the anchoring groups. To assess prediction quality, the template loops were replaced by the proposed ones, and their root mean square deviations from the target structures were determined. In addition, the best 20 fragments in the whole loop data bank used (those with the lowest deviations from the target structures after insertion into the templates) were determined and compared with the proposals. The analysis of the results shows the limitations of knowledge-based loop prediction and demonstrates that the selection of the anchoring groups is the most important step in the whole procedure. Proteins 1999;37:56–64. © 1999 Wiley-Liss, Inc.
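The ranking step described above, sorting candidate fragments by how well their anchoring groups match those of the template, can be illustrated with a minimal Python sketch. The abstract does not state the actual geometric scoring criterion, so the sketch assumes a superposition-free measure: the root mean square difference between corresponding interatomic distances of the anchor atoms. The function names, the dictionary layout, and the coordinates are all hypothetical and serve only as an illustration.

```python
import math
from itertools import combinations


def distance_rms(coords_a, coords_b):
    """RMS difference between corresponding interatomic distances.

    Assumed stand-in for the anchor-geometry criterion; it is invariant
    to rotation and translation, so no superposition is needed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("anchor sets must have the same number of atoms")
    pairs = list(combinations(range(len(coords_a)), 2))
    if not pairs:
        raise ValueError("at least two anchor atoms are required")
    squared = sum(
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(squared / len(pairs))


def rank_fragments(template_anchors, fragments):
    """Sort fragments by increasing anchor deviation,
    i.e. by decreasing correspondence with the template anchors."""
    return sorted(
        fragments,
        key=lambda frag: distance_rms(template_anchors, frag["anchors"]),
    )


# Hypothetical anchor atoms: two before and two after the loop gap.
template_anchors = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0), (6.5, 0.0, 0.0)]

candidates = [
    # Wider gap between the anchoring groups than in the template.
    {"name": "B", "anchors": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (7.0, 0.0, 0.0), (8.5, 0.0, 0.0)]},
    # Same geometry as the template, only translated.
    {"name": "A", "anchors": [(1.0, 1.0, 1.0), (2.5, 1.0, 1.0), (6.0, 1.0, 1.0), (7.5, 1.0, 1.0)]},
]

ranked = rank_fragments(template_anchors, candidates)
print([frag["name"] for frag in ranked])
```

In this toy case the translated fragment ranks first because its anchor distances are identical to the template's. A full evaluation as described in the abstract would additionally insert each proposed fragment into the template and measure its root mean square deviation from the target loop, which this sketch does not attempt.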
