Loops In Proteins (LIP)--a comprehensive loop database for homology modelling.

One of the most important and challenging tasks in protein modelling is the prediction of loops, as can be seen in the large variety of existing approaches. Loops In Proteins (LIP) is a database that includes all protein segments of a length up to 15 residues contained in the Protein Data Bank (PDB). In this study, the applicability of LIP to loop prediction in the framework of homology modelling is investigated. Searching the database for loop candidates takes less than 1 s on a desktop PC, and ranking them takes a few minutes. This is an order of magnitude faster than most existing procedures. The measure of accuracy is the root mean square deviation (RMSD) with respect to the main-chain atoms after local superposition of target loop and predicted loop. Loops of up to nine residues length were modelled with a local RMSD <1 A and those of length up to 14 residues with an accuracy better than 2 A. The results were compared in detail with a thoroughly evaluated and tested ab initio method published recently and additionally with two further methods for a small loop test set. The LIP method produced very good predictions. In particular for longer loops it outperformed other methods.

[1]  J. Greer Comparative model-building of the mammalian serine proteases. , 1981, Journal of molecular biology.

[2]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[3]  M. Karplus,et al.  Prediction of the folding of short polypeptide segments by uniform conformational sampling , 1987, Biopolymers.

[4]  M. Karplus,et al.  PDB-based protein loop prediction: parameters for selection and methods for optimization. , 1997, Journal of molecular biology.

[5]  R Samudrala,et al.  A graph-theoretic algorithm for comparative modeling of protein structure. , 1998, Journal of molecular biology.

[6]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[7]  A. Sali,et al.  Protein Structure Prediction and Structural Genomics , 2001, Science.

[8]  R Leplae,et al.  Analysis and assessment of comparative modeling predictions in CASP4 , 2001, Proteins.

[9]  C. Deane,et al.  CODA: A combined algorithm for predicting the structurally variable regions of protein models , 2001, Protein science : a publication of the Protein Society.

[10]  D. Baker,et al.  Protein structure prediction in 2002. , 2002, Current opinion in structural biology.

[11]  Eckart Bindewald,et al.  A divide and conquer approach to fast loop modeling. , 2002, Protein engineering.

[12]  M. DePristo,et al.  Ab initio construction of polypeptide fragments: Efficient generation of accurate, representative ensembles , 2003, Proteins.