Prediction of the conformation and geometry of loops in globular proteins: Testing ArchDB, a structural classification of loops

In protein structure prediction, a central problem is defining the structure of a loop connecting two secondary structure elements. This problem arises frequently in homology modeling, in fold recognition, and in several ab initio structure prediction strategies. In our previous work, we developed ArchDB, a classification database of structural motifs. The database contains 12,665 clustered loops in 451 structural classes, which carry information about the ϕ–ψ angles in the loops, and 1492 structural subclasses, which carry the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used: an HMM profile and a PSSM derived from PSI‐BLAST. In a jack‐knife test, homologous loops were removed according to the SCOP superfamily definition, and predictions were then made against recalculated profiles that take into account only the sequence information. Two scenarios were considered: (1) prediction of structural class, with application in comparative modeling, and (2) prediction of structural subclass, with application in fold recognition and ab initio prediction. In the first scenario, structural class prediction was made directly on loops with secondary structure assigned from the X‐ray structure; when the top 20 of the 451 possible classes are considered, the best prediction accuracy is 78.5%. In the second scenario, structural subclass prediction was made on loops whose boundaries were defined by PSI‐PRED secondary structure prediction (Jones, J Mol Biol 1999;292:195–202); when the top 20 of the 1492 subclasses are considered, the best accuracy is 46.7%. The accuracy of loop prediction was also evaluated by means of RMSD calculations. Proteins 2005. © 2005 Wiley‐Liss, Inc.
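The evaluation described above combines a leave-homologs-out jack-knife with a "top 20" accuracy measure. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names, the `"superfamily"` dictionary key, and the toy data are all hypothetical, and the profile scoring itself (HMM or PSSM) is assumed to have already produced a ranked list of classes for each loop.

```python
from collections import defaultdict


def top_k_accuracy(ranked_predictions, true_classes, k=20):
    """Fraction of loops whose true class is among the k best-ranked classes.

    ranked_predictions: one list per loop, classes ordered best score first.
    true_classes: the known class of each loop, in the same order.
    """
    if len(ranked_predictions) != len(true_classes):
        raise ValueError("one ranking is needed per loop")
    if not true_classes:
        raise ValueError("no loops to evaluate")
    hits = sum(
        true in ranked[:k]
        for ranked, true in zip(ranked_predictions, true_classes)
    )
    return hits / len(true_classes)


def leave_superfamily_out(loops):
    """Yield (held_out, training) pairs, one pair per superfamily.

    `loops` is a list of dicts with a hypothetical "superfamily" key.
    Every loop sharing the held-out superfamily is excluded from the
    training set, so profiles rebuilt from `training` contain no homologs.
    """
    by_superfamily = defaultdict(list)
    for loop in loops:
        by_superfamily[loop["superfamily"]].append(loop)
    for superfamily, held_out in by_superfamily.items():
        training = [l for l in loops if l["superfamily"] != superfamily]
        yield held_out, training


# Toy usage: four loops, three candidate classes (hypothetical labels).
ranked = [["A", "B", "C"], ["B", "C", "A"], ["C", "B", "A"], ["A", "C", "B"]]
truth = ["A", "A", "B", "B"]
print(top_k_accuracy(ranked, truth, k=1))
print(top_k_accuracy(ranked, truth, k=2))
```

In a full jack-knife, the profiles would be recalculated from each `training` set before ranking the classes for the corresponding `held_out` loops; that step depends on the profile software and is left out here.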
