Accurate Prediction of Protein Secondary Structural Content

An improved multiple linear regression (MLR) method is proposed for predicting a protein's secondary structural content from its primary sequence. The amino acid composition, the autocorrelation function, and the interaction function of side-chain mass, all derived from the primary sequence, are taken into account. In a jackknife test over 704 unrelated proteins, the average absolute prediction errors are 0.088, 0.081, and 0.059, with standard deviations of 0.073, 0.066, and 0.055, for α-helix, β-sheet, and coil, respectively. The requirement that the predicted secondary structure contents sum to a value close to 1.0 is introduced as a criterion for judging whether a prediction is acceptable. When only predictions whose contents sum to between 0.99 and 1.01 are accepted (about 11% of all proteins), the absolute errors are 0.058 for α-helix, 0.054 for β-sheet, and 0.045 for coil.
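The acceptance criterion and the error measure described above can be illustrated with a minimal sketch. It assumes each prediction is a triple of (α-helix, β-sheet, coil) fractions; the function names and the sample numbers are hypothetical and are not taken from the paper, and the regression model itself is not reproduced here.

```python
from typing import List, Tuple

# Assumed representation: (helix, sheet, coil) content fractions.
Content = Tuple[float, float, float]


def is_acceptable(pred: Content, lo: float = 0.99, hi: float = 1.01) -> bool:
    """Accept a prediction only if its three contents sum to within [lo, hi]."""
    return lo <= sum(pred) <= hi


def accepted_subset(
    preds: List[Content], observed: List[Content],
    lo: float = 0.99, hi: float = 1.01,
) -> Tuple[List[Content], List[Content]]:
    """Keep only the (prediction, observation) pairs that pass the sum criterion."""
    pairs = [(p, o) for p, o in zip(preds, observed) if is_acceptable(p, lo, hi)]
    return [p for p, _ in pairs], [o for _, o in pairs]


def mean_abs_errors(preds: List[Content], observed: List[Content]) -> Content:
    """Average absolute error per structural class over all proteins."""
    if not preds or len(preds) != len(observed):
        raise ValueError("need two non-empty lists of equal length")
    n = len(preds)
    return tuple(
        sum(abs(p[k] - o[k]) for p, o in zip(preds, observed)) / n
        for k in range(3)
    )


# Made-up example values, for illustration only.
example_preds = [(0.30, 0.20, 0.50), (0.40, 0.30, 0.45)]
example_obs = [(0.35, 0.20, 0.45), (0.30, 0.30, 0.40)]

kept_preds, kept_obs = accepted_subset(example_preds, example_obs)
errors = mean_abs_errors(kept_preds, kept_obs)
```

In this toy input the second prediction sums to 1.15 and is rejected, so the errors are computed over the first protein only. The trade-off reported in the abstract is the same in kind: a narrow acceptance window lowers the error but covers only a small fraction of proteins.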
