Factors limiting the performance of prediction‐based fold recognition methods

In the past few years, a new generation of fold recognition methods has been developed, in which the classical sequence information is combined with information obtained from secondary structure and, sometimes, accessibility predictions. The results are promising, indicating that this approach may compete with potential‐based methods (Rost B et al., 1997, J Mol Biol 270:471–480). Here we present a systematic study of the different factors contributing to the performance of these methods, in particular when applied to the problem of fold recognition of remote homologues.
