A new approach to predict the helix/strand content of globular proteins.

An improved multiple linear regression method is proposed for predicting the alpha-helix and beta-strand content of a globular protein from its primary sequence and structural class. The method takes into account the amino acid composition and the auto-correlation functions derived from the hydrophobicity profile of the primary sequence. Only a subset of the amino acid compositions and a subset of the auto-correlation functions are used as regression terms, namely those that give the smallest prediction error. In the resubstitution test, the average absolute errors are 0.052 for helix content and 0.047 for strand content, with standard deviations of 0.050 and 0.047, respectively. In the jackknife test, a rigorous cross-validation procedure, the average absolute errors are 0.058 for helix content and 0.053 for strand content, with standard deviations of 0.057 and 0.053, respectively. The two tests indicate that the method is self-consistent and extrapolates effectively to proteins outside the training set. This level of accuracy suggests that the method is suitable for practical applications.
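The pipeline described above (composition and auto-correlation features, a least-squares fit, and a jackknife error estimate) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the Kyte-Doolittle scale stands in for whatever hydrophobicity scale the paper uses, the auto-correlation function is taken as the mean product of hydrophobicity values at sequence separation k, and the names `composition`, `autocorrelation`, `features`, `fit`, and `jackknife_errors` are hypothetical. The selected amino acids and lags are left as parameters because the abstract does not list them.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: Kyte-Doolittle hydropathy values; the paper may use another scale.
HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 amino acids, in AMINO_ACIDS order."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / len(seq)


def autocorrelation(seq, max_lag):
    """Assumed form: A_k = mean of h_i * h_(i+k) for k = 1..max_lag."""
    h = np.array([HYDROPHOBICITY[a] for a in seq])
    n = len(h)
    return np.array([np.dot(h[:n - k], h[k:]) / (n - k)
                     for k in range(1, max_lag + 1)])


def features(seq, aa_subset, lags):
    """Regression row: intercept, selected compositions, selected lags."""
    comp = composition(seq)
    ac = autocorrelation(seq, max(lags))
    return np.concatenate((
        [1.0],
        [comp[AMINO_ACIDS.index(a)] for a in aa_subset],
        [ac[k - 1] for k in lags],
    ))


def fit(X, y):
    """Ordinary least-squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def jackknife_errors(X, y):
    """Leave-one-out absolute errors: refit without protein i, predict i."""
    idx = np.arange(len(y))
    errors = np.empty(len(y))
    for i in idx:
        keep = idx != i
        coef = fit(X[keep], y[keep])
        errors[i] = abs(X[i] @ coef - y[i])
    return errors
```

Since the method conditions on structural class, one would presumably build a separate design matrix `X` (one `features` row per protein) and fit separately for each class. The mean and standard deviation of `jackknife_errors(X, y)` would then correspond to the kind of error statistics quoted in the abstract.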
