Origins of structural diversity within sequentially identical hexapeptides

Efforts to predict protein secondary structure have been hampered by the apparent structural plasticity of local amino acid sequences. Kabsch and Sander (1984, Proc. Natl. Acad. Sci. USA 81, 1075–1078) articulated this problem by demonstrating that identical pentapeptide sequences can adopt distinct structures in different proteins. With the increased size of the protein structure database and the availability of new methods to characterize structural environments, we revisit this observation of structural plasticity. Within a set of proteins with less than 50% sequence identity, 59 pairs of identical hexapeptide sequences were identified. The local structures of these pairs were compared and their surrounding structural environments examined. Within a protein structural class (α/α, β/β, α/β, α + β), the structural similarity of sequentially identical hexapeptides is usually preserved. This study finds eight pairs of identical hexapeptide sequences that adopt β‐strand structure in one protein and α‐helical structure in the other. In none of the eight cases do the members of these sequence pairs come from proteins within the same folding class. These results have implications for class‐dependent secondary structure prediction algorithms.
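The search for identical hexapeptides in different proteins can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `shared_hexapeptides` and the toy sequences are hypothetical, and the sketch covers only exact sequence matching. It omits the filter for proteins with less than 50% sequence identity and the comparison of local structure, both of which would need alignment and secondary structure data.

```python
from collections import defaultdict
from itertools import combinations


def shared_hexapeptides(sequences, k=6):
    """Map each k-mer found in two different proteins to its occurrence pairs.

    `sequences` is a dict of protein name -> one-letter amino acid string.
    Each occurrence is a (protein name, 0-based start position) tuple.
    """
    # Index every k-residue window by its sequence.
    index = defaultdict(list)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))

    # Keep only pairs of occurrences that come from different proteins.
    pairs = {}
    for peptide, hits in index.items():
        cross = [(a, b) for a, b in combinations(hits, 2) if a[0] != b[0]]
        if cross:
            pairs[peptide] = cross
    return pairs


# Toy sequences, invented for illustration only (not real proteins).
toy = {
    "protA": "MKVLAAGIVGLT",
    "protB": "GSVLAAGIPQ",
    "protC": "MTTEEQQLLKK",
}
result = shared_hexapeptides(toy)
for peptide, occurrences in result.items():
    print(peptide, occurrences)
```

In a fuller analysis, each reported pair would then be annotated with a secondary structure assignment for both occurrences, so that helix/strand mismatches such as the eight pairs described above could be flagged.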
