A method for the improvement of threading‐based protein models

A new method for the homology‐based modeling of protein three‐dimensional structures is proposed and evaluated. Aligning a query sequence to a structural template with threading algorithms usually yields low‐resolution molecular models, and the proposed method attempts to improve them. In the first stage, a high‐coordination lattice approximation of the query protein fold is built by tracking the incomplete alignment to the structural template and connecting the alignment gaps. These initial lattice folds are very similar to the structures resulting from standard molecular modeling protocols. A Monte Carlo simulated annealing procedure is then used to refine the initial structure. The process is controlled by the model's internal force field and a set of loosely defined restraints that keep the lattice chain in the vicinity of the template conformation. The internal force field consists of several knowledge‐based statistical potentials that are enhanced by analysis of multiple sequence alignments. The template restraints are implemented such that the model chain can slide along the template structure or even ignore a substantial fraction of the initial alignment. The resulting lattice models are, in most cases, closer (sometimes much closer) to the target structure than the initial threading‐based models. All‐atom models could easily be built from the lattice chains. The method is illustrated on 12 examples of target/template pairs whose initial threading alignments are of varying quality. Possible applications of the method to protein function annotation are briefly discussed. Proteins 1999;37:592–610. ©1999 Wiley‐Liss, Inc.
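
The refinement stage described above can be illustrated with a toy sketch of Monte Carlo simulated annealing on a cubic lattice under loose template restraints. It is not the authors' implementation. The flat‐bottom restraint form, the single‐bead move set, the bond‐length limit, the linear cooling schedule, and all names and parameter values (`restraint_energy`, `is_valid`, `anneal`, `tolerance`, `max_bond_sq`) are assumptions chosen for brevity. The knowledge‐based internal force field is omitted, and each bead is tied to one fixed template site, so the chain cannot slide along the template as it can in the method proper.

```python
import math
import random


def restraint_energy(chain, template, tolerance=1.0, weight=1.0):
    """Flat-bottom penalty (assumed form): zero while a bead stays within
    `tolerance` of its template site, quadratic in the excess beyond it."""
    total = 0.0
    for bead, ref in zip(chain, template):
        excess = math.dist(bead, ref) - tolerance
        if excess > 0.0:
            total += weight * excess * excess
    return total


def is_valid(chain, i, max_bond_sq=3):
    """Bead i must keep short, non-zero bonds to its chain neighbours
    and must not share a lattice site with any other bead."""
    for j in (i - 1, i + 1):
        if 0 <= j < len(chain):
            bond_sq = sum((a - b) ** 2 for a, b in zip(chain[i], chain[j]))
            if bond_sq == 0 or bond_sq > max_bond_sq:
                return False
    return chain.count(chain[i]) == 1


def anneal(chain, template, steps=20000, t_start=2.0, t_end=0.1, seed=0):
    """Metropolis sampling with a linearly decreasing temperature."""
    rng = random.Random(seed)
    chain = [tuple(p) for p in chain]
    energy = restraint_energy(chain, template)
    for step in range(steps):
        temperature = t_start + (t_end - t_start) * step / (steps - 1)
        # Trial move: shift one randomly chosen bead by one lattice unit.
        i = rng.randrange(len(chain))
        axis = rng.randrange(3)
        old = chain[i]
        new = list(old)
        new[axis] += rng.choice((-1, 1))
        chain[i] = tuple(new)
        if not is_valid(chain, i):
            chain[i] = old  # reject moves that break the chain geometry
            continue
        new_energy = restraint_energy(chain, template)
        d_e = new_energy - energy
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if d_e <= 0 or rng.random() < math.exp(-d_e / temperature):
            energy = new_energy
        else:
            chain[i] = old
    return chain, energy
```

Calling `anneal(start, template)` with two equal‐length lists of integer lattice coordinates returns the refined chain and its restraint energy. Because the penalty vanishes inside the tolerance radius, the chain is only held near the template rather than pinned to it, which is the sense in which the restraints are loose in this sketch.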
