Comparative modeling in CASP5: Progress is evident, but alignment errors remain a significant hindrance

Models for 20 comparative modeling targets were submitted for the fifth round of the “blind” test of protein structure prediction methods (CASP5; http://predictioncenter.llnl.gov/casp5). The modeling approach used in CASP5 was similar to that used two years earlier in CASP4 (Venclovas, Proteins 2001; Suppl 5:47–54). Its main features are the use of multiple templates, initial assessment of alignment reliability in a region‐specific manner, and structure‐based selection of alignment variants in unreliable regions. The CASP5 modeling results presented here show significant improvement over CASP4, especially in the area of distant homology. The improvements include more effective use of multiple templates and better alignments. However, a number of structurally conserved regions in the submitted distant homology models were misaligned. Analysis of these errors indicates that the absolute majority of them occurred in regions deemed unreliable in the course of model building. Most of these error‐prone regions are characterized by a peripheral location and a lack of conserved sequence patterns. For a few of the error‐prone regions, all methods evaluated during CASP5 proved ineffective, pointing to the need for more sensitive energy‐based methods. Despite these remaining issues, the applicability of comparative modeling continues to expand into more distant evolutionary relationships, providing a means to structurally characterize a significant number of currently available protein sequences. Proteins 2003;53:380–388. © 2003 Wiley‐Liss, Inc.
