Algorithms for Protein Structural Motif Recognition

The identification of protein sequences that fold into certain known three-dimensional (3D) structures, or motifs, is evaluated through a probabilistic analysis of their one-dimensional (1D) sequences. We present a correlation method that runs in linear time and incorporates pairwise dependencies between amino acid residues at multiple distances to assess the conditional probability that a given residue is part of a given 3D structure. This method is generalized to multiple motifs, where a dynamic programming approach leads to an efficient algorithm that runs in linear time for practical problems. Using this approach, we were able to distinguish (2-stranded) coiled-coil from non-coiled-coil domains and globins from nonglobins. When tested on the Brookhaven X-ray crystal structure database, the method produces no false-positive or false-negative predictions of coiled coils.
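
The idea of scoring residues with pairwise dependencies at several distances can be illustrated with a minimal sketch. It is not the paper's algorithm: the log-odds formulation, the pseudocount smoothing, the function names (`pair_counts`, `make_scorer`), the choice of distances, and the toy training sequences are all assumptions made for illustration. The sketch only shows why such a score costs time linear in the sequence length when the set of distances is fixed.

```python
import math
from collections import Counter


def pair_counts(seqs, distances):
    """Count residue pairs (a, b) that occur d positions apart, for each d."""
    counts = {d: Counter() for d in distances}
    for s in seqs:
        for d in distances:
            for i in range(len(s) - d):
                counts[d][(s[i], s[i + d])] += 1
    return counts


def make_scorer(motif_seqs, background_seqs, distances,
                pseudo=1.0, alphabet_size=20):
    """Build a per-residue scorer from hypothetical training sets.

    The score at position i sums, over each distance d, the log-odds of
    seeing the pair (seq[i], seq[i + d]) in motif versus background data.
    Pseudocounts (an assumption) keep unseen pairs from giving log(0).
    """
    mc = pair_counts(motif_seqs, distances)
    bc = pair_counts(background_seqs, distances)
    n_pairs = alphabet_size ** 2
    totals = {
        d: (sum(mc[d].values()) + pseudo * n_pairs,
            sum(bc[d].values()) + pseudo * n_pairs)
        for d in distances
    }

    def score(seq):
        out = [0.0] * len(seq)
        # One pass over the sequence, constant work per distance:
        # O(len(seq) * len(distances)), linear for a fixed distance set.
        for i in range(len(seq)):
            for d in distances:
                if i + d < len(seq):
                    pair = (seq[i], seq[i + d])
                    m_total, b_total = totals[d]
                    p_motif = (mc[d][pair] + pseudo) / m_total
                    p_back = (bc[d][pair] + pseudo) / b_total
                    out[i] += math.log(p_motif / p_back)
        return out

    return score


# Toy data, invented for this sketch (not real training sets).
score = make_scorer(
    motif_seqs=["LEALKEKLEALKEK"],
    background_seqs=["GSPGTNGSPGTNQD"],
    distances=(1, 3, 4),
)
motif_like = score("LEALKEK")
background_like = score("GSPGTNG")
```

With these toy inputs, residues of the motif-like query receive non-negative scores and residues of the background-like query receive non-positive ones. A real application would need curated training sets and a calibrated threshold, neither of which is specified here.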
