Failures of inverse folding and threading with gapped alignment

To calculate the tertiary structure of a protein from its amino acid sequence, the thermodynamic approach requires a potential function of sequence and conformation that has its global minimum at the native conformation for many different proteins. Here we study the behavior of such functions for the simplest model system that still has some of the features of the protein folding problem, namely two‐dimensional square lattice chain configurations involving two residue types. First we show that even the given contact potential, which by definition is used to identify the folding sequences and their unique native conformations, cannot always correctly select which sequences will fold to a given structure. Second, we demonstrate that the given contact potential is not always able to favor the native alignment of a native sequence on its own native conformation over other gapped alignments of different folding sequences onto that same conformation. Because of these shortcomings, even in this simple model system in which all conformations and all native sequences are known and determined directly by the given potential, we must reexamine our expectations for empirical potentials used for inverse folding and gapped alignment on more realistic representations of proteins. © 1996 Wiley‐Liss, Inc.
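The model system described above can be sketched in a few lines: enumerate all self-avoiding chain configurations on the two-dimensional square lattice, score each with a contact potential over two residue types, and call a sequence "folding" when exactly one conformation attains the minimum energy. This is only an illustrative sketch. The energy values in `CONTACT_ENERGY` (an HP-style choice with residue labels `H` and `P`) and all function names are assumptions made here, not the potential or the code used in the paper.

```python
from itertools import product

# Hypothetical contact energies for two residue types (HP-style assumption;
# the paper's actual contact potential is not reproduced here).
CONTACT_ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0,
                  ("P", "H"): 0.0, ("P", "P"): 0.0}
MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def conformations(n):
    """Enumerate self-avoiding walks of n >= 2 residues, one per rotation/reflection class."""
    results = []

    def extend(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in path:
                continue  # site already occupied: not self-avoiding
            if not turned and dy == -1:
                continue  # the first turn away from the x axis must go up
            extend(path + [nxt], turned or dy != 0)

    extend([(0, 0), (1, 0)], False)  # first bond fixed along +x
    return results


def contacts(conf):
    """Pairs (i, j) of residues that are lattice neighbors but not chain neighbors."""
    index_at = {site: i for i, site in enumerate(conf)}
    pairs = []
    for i, (x, y) in enumerate(conf):
        for dx, dy in MOVES[:2]:  # look only in +x and +y so each pair is counted once
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return pairs


def energy(seq, conf):
    """Total contact energy of a sequence placed on a conformation without gaps."""
    return sum(CONTACT_ENERGY[(seq[i], seq[j])] for i, j in contacts(conf))


def native_conformation(seq, confs):
    """Return the unique minimum-energy conformation, or None if the minimum is degenerate."""
    energies = [energy(seq, c) for c in confs]
    lowest = min(energies)
    winners = [c for c, e in zip(confs, energies) if e == lowest]
    return winners[0] if len(winners) == 1 else None


def folding_sequences(n):
    """Map every sequence of length n that has a unique native conformation to that conformation."""
    confs = conformations(n)
    result = {}
    for letters in product("HP", repeat=n):
        seq = "".join(letters)
        native = native_conformation(seq, confs)
        if native is not None:
            result[seq] = native
    return result
```

Calling `folding_sequences(n)` for a small chain length gives the full table of folding sequences and their native conformations under the assumed potential. Inverse folding and gapped threading tests of the kind the abstract describes would then ask whether the same potential, applied to a fixed conformation, ranks the right sequences and alignments first.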
