Discrete representations of the protein Cα chain

Background: When large numbers of protein conformations are generated and screened, as in protein structure prediction studies, it is often advantageous to change the conformation in units of four consecutive residues at a time. The internal geometry of a chain of four consecutive Cα atoms is completely described by three angles, θ1, τ, and θ2, where τ is the virtual torsion angle defined by the four atoms and θ1 and θ2 are the virtual bond angles flanking the torsion angle on either side. In this paper, we examine the quality of the protein structures obtained when these angles are restricted to a set of discrete values (discrete states).

Results: Different models were produced by selecting different sets of discrete states. The performance of these models was tested by rebuilding the Cα chains of 139 high-resolution nonhomologous protein structures using the build-up procedure of Park and Levitt. We find that the discrete state models introduce distortions at three levels, which can be measured by the 'context-free', the 'in-context', and the overall root-mean-square deviation of the Cα coordinates (crms), respectively, and that these levels of distortion are interrelated. As found by Park and Levitt, for most models the overall crms decreases smoothly as the complexity of the model increases. However, the decrease is significantly faster with our models than Park and Levitt observed with theirs. We also find that it is possible to choose models that perform considerably worse than expected from this smooth dependence on complexity.

Conclusions: Of our models, the most suitable for use in initial protein folding studies appears to be model S8, in which the effective number of states available for a given residue quartet is 6.5. This model builds helices, β-strands, and coil/loop structures with approximately equal quality and gives an overall crms of 1.9 Å on average, with relatively little variation among the proteins tried.
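The three angles named in the Background can be computed from four consecutive Cα positions with elementary vector algebra. The sketch below is only an illustration, not the authors' code: the function names (`bond_angle`, `torsion`, `quartet_angles`) are hypothetical, coordinates are assumed to be plain (x, y, z) tuples, and the sign of τ is simply whatever the standard atan2 dihedral formula yields.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def bond_angle(p, q, r):
    """Virtual bond angle at q, in degrees, between q->p and q->r."""
    u, v = _sub(p, q), _sub(r, q)
    c = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def torsion(p0, p1, p2, p3):
    """Virtual torsion angle about the p1-p2 axis, in degrees, in (-180, 180]."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = _norm(b1) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def quartet_angles(p0, p1, p2, p3):
    """Return (theta1, tau, theta2) for four consecutive C-alpha positions."""
    theta1 = bond_angle(p0, p1, p2)   # angle at the second atom
    tau = torsion(p0, p1, p2, p3)     # torsion about the central virtual bond
    theta2 = bond_angle(p1, p2, p3)   # angle at the third atom
    return theta1, tau, theta2

# Toy coordinates (not real protein data), chosen so every angle is 90 degrees.
theta1, tau, theta2 = quartet_angles((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```

A discrete state model in the sense of the abstract would then replace each computed (θ1, τ, θ2) triple by one of a small set of allowed values; the choice of those values is the subject of the paper and is not reproduced here.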

[1] P. Argos, et al. Identifying the tertiary fold of small proteins with different topologies from sequence and secondary structure using the genetic algorithm and extended criteria specific for strand regions, 1996, Journal of molecular biology.

[2] S. Wodak, et al. Prediction of protein backbone conformation based on seven structure assignments. Influence of local interactions, 1991, Journal of molecular biology.

[3] Roderick E. Hubbard, et al. Analysis of Cα geometry in protein structures, 1994.

[4] G. J. Williams, et al. The Protein Data Bank: a computer-based archival file for macromolecular structures, 1978, Archives of biochemistry and biophysics.

[5] E. Shakhnovich, et al. Pseudodihedrals: Simplified protein backbone representation with knowledge‐based energy, 1994, Protein science: a publication of the Protein Society.

[6] W. Kabsch, et al. Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features, 1983, Biopolymers.

[7] M. Levitt. A simplified representation of protein conformations for rapid simulation of protein folding, 1976, Journal of molecular biology.

[8] Michael Levitt, et al. Protein folding: Current Opinion in Structural Biology 1991, 1:224–229, 1991.

[9] B. Rost, et al. Prediction of protein secondary structure at better than 70% accuracy, 1993, Journal of molecular biology.

[10] P. Argos, et al. Folding the main chain of small proteins with the genetic algorithm, 1994, Journal of molecular biology.

[11] Robert L. Jernigan, et al. Protein folds: Current Opinion in Structural Biology 1992, 2:248–256, 1992.

[12] T. J. Oldfield, et al. Analysis of C alpha geometry in protein structures, 1994, Proteins.

[13] J. Skolnick, et al. Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme, 1994, Proteins.

[14] T. Creighton, et al. Protein Folding, 1992.

[15] K. Dill, et al. The Protein Folding Problem, 1993.

[16] J. Szulmajster. Protein folding, 1988, Bioscience reports.

[17] M. Levitt, et al. The complexity and accuracy of discrete state models of protein structure, 1995, Journal of molecular biology.

[18] J. Thornton, et al. A revised set of potentials for beta-turn formation in proteins, 1994, Protein science: a publication of the Protein Society.

[19] S. L. Mowbray, et al. Cα‐based torsion angles: A simple tool to analyze protein conformational changes, 1995, Protein science: a publication of the Protein Society.

[20] Jaap Heringa, et al. OBSTRUCT: a program to obtain largest cliques from a protein sequence set according to structural resolution and sequence similarity, 1992, Comput. Appl. Biosci.

[21] W. Kabsch. A solution for the best rotation to relate two sets of vectors, 1976.

[22] Protein folds, 1992, Current Biology.

[23] W. Kabsch. A discussion of the solution for the best rotation to relate two sets of vectors, 1978.

[24] T. P. Flores, et al. Identification and classification of protein fold families, 1993, Protein engineering.