Fast protein fold recognition via sequence to structure alignment and contact capacity potentials.

We propose new empirical scoring potentials and associated alignment procedures for optimally aligning protein sequences to protein structures. The method has two main applications: first, recognizing a plausible fold for a protein sequence of unknown structure from a database of representative protein structures and, second, improving sequence alignments by using structural information to find a better starting point for homology-based modelling. The empirical scoring function is derived from an analysis of a nonredundant database of known structures by converting relative frequencies into pseudoenergies, using a normalization according to the inverse Boltzmann law. These so-called contact capacity potentials turn out to be discriminative enough to detect structural folds in the absence of significant sequence similarity and, at the same time, simple enough to allow very fast optimization in an alignment procedure.
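The conversion of relative frequencies into pseudoenergies can be illustrated with a minimal sketch. It assumes the common form of the inverse Boltzmann relation, E(r, s) = -kT ln(f(r, s) / (f(r) f(s))), where r is a residue type and s is a structural state. The function name `inverse_boltzmann`, the two-state "buried"/"exposed" labels, and the toy counts are hypothetical and chosen only for illustration; they are not the contact capacity definition or the data used in the paper.

```python
import math
from collections import Counter


def inverse_boltzmann(counts, kT=1.0):
    """Convert observed (residue, state) counts into pseudoenergies.

    Assumed form: E(r, s) = -kT * ln( f(r, s) / (f(r) * f(s)) ),
    i.e. the observed joint frequency is normalized by the frequency
    expected if residue type and state were independent.
    """
    total = sum(counts.values())
    residue_totals = Counter()
    state_totals = Counter()
    for (residue, state), n in counts.items():
        residue_totals[residue] += n
        state_totals[state] += n

    energies = {}
    for (residue, state), n in counts.items():
        if n == 0:
            continue  # unobserved pairs get no energy in this sketch
        observed = n / total
        expected = (residue_totals[residue] / total) * (state_totals[state] / total)
        energies[(residue, state)] = -kT * math.log(observed / expected)
    return energies


# Toy counts (invented for illustration, not taken from any database).
counts = {
    ("LEU", "buried"): 30,
    ("LEU", "exposed"): 10,
    ("LYS", "buried"): 10,
    ("LYS", "exposed"): 30,
}
energies = inverse_boltzmann(counts)
for key in sorted(energies):
    print(key, round(energies[key], 3))
```

With these toy counts, pairs observed more often than expected under independence receive a negative (favourable) pseudoenergy and under-represented pairs a positive one. A table of this kind could then serve as the per-position score in a sequence-to-structure alignment.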
