Prediction of the Secondary Structure Contents of Globular Proteins Based on Three Structural Classes

Predicting the secondary structure contents (those of α-helix and β-strand) of a globular protein is of great use in protein structure prediction. In this paper, a new prediction algorithm is proposed based on Chou's database [Chou (1995), Proteins 21, 319–344]. The new algorithm is an improved multiple linear regression method that takes into account nonlinear and coupling terms involving the frequencies of the different amino acids and the length of the protein. The prediction also relies on the structural class of the protein, but only three classes are considered instead of four: the α class, the β class, and a single αβ class that merges the α+β and α/β classes. The ambiguity that usually arises between α+β proteins and α/β proteins is thus eliminated. A resubstitution test of the algorithm gives average absolute errors of 0.040 and 0.035 for the predicted α-helix content and β-strand content, respectively. A cross-validation test, the jackknife analysis, gives average absolute errors of 0.051 and 0.045 for the predicted α-helix content and β-strand content, respectively. Both tests indicate the self-consistency and the extrapolating effectiveness of the new algorithm. Compared with other methods, ours is simple and convenient to use while retaining high prediction accuracy. Because the prediction of the structural class is incorporated, the only inputs to our method are the amino acid composition and the length of the protein to be predicted.
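The workflow summarized in the abstract (a regression on amino acid frequencies and chain length, fitted separately for each structural class and checked by resubstitution and jackknife tests) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact nonlinear and coupling terms, so the feature map used here (frequencies, squared frequencies, and frequencies multiplied by the logarithm of the length) is an assumption, as are all function names and the synthetic data.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20 amino acid frequencies of a sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)


def design_matrix(freqs, lengths):
    """Hypothetical feature map (an assumption, not the paper's exact terms).

    Columns: f_i (linear), f_i**2 (nonlinear), f_i * ln(L) (coupling with length).
    No explicit intercept is added because the frequencies already sum to 1.
    """
    freqs = np.asarray(freqs, dtype=float)
    log_len = np.log(np.asarray(lengths, dtype=float))[:, None]
    return np.hstack([freqs, freqs ** 2, freqs * log_len])


def fit(X, y):
    """Ordinary least-squares regression coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_by_class(X, y, classes):
    """Fit one regression per structural class label (e.g. 'a', 'b', 'ab')."""
    classes = np.asarray(classes)
    return {c: fit(X[classes == c], y[classes == c]) for c in np.unique(classes)}


def resubstitution_error(X, y):
    """Average absolute error when the training proteins are re-predicted."""
    return float(np.mean(np.abs(X @ fit(X, y) - y)))


def jackknife_error(X, y):
    """Average absolute error when each protein is left out of the fit in turn."""
    idx = np.arange(len(y))
    errors = []
    for i in idx:
        keep = idx != i
        coef = fit(X[keep], y[keep])
        errors.append(abs(X[i] @ coef - y[i]))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Synthetic stand-in data; a real study would use a curated protein database.
    rng = np.random.default_rng(0)
    n = 300
    freqs = rng.dirichlet(np.ones(20), size=n)
    lengths = rng.integers(50, 500, size=n)
    classes = rng.choice(["a", "b", "ab"], size=n)
    helix = freqs @ rng.uniform(0.0, 1.0, 20) + rng.normal(0.0, 0.02, n)

    X = design_matrix(freqs, lengths)
    models = fit_by_class(X, helix, classes)
    for c in sorted(models):
        mask = classes == c
        print(c, resubstitution_error(X[mask], helix[mask]),
              jackknife_error(X[mask], helix[mask]))
```

For a least-squares fit, the jackknife error is never smaller than the resubstitution error, which matches the ordering of the two pairs of errors reported above. In this sketch the class label is supplied directly, whereas the method described in the abstract predicts the structural class first, so that only the composition and length are needed as input.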
