Prediction of the surface-interior diagram of globular proteins by an empirical method.

The number of amino acid residues in contact with a given residue in a globular protein is a simple and useful measure of whether that residue lies on the surface or in the interior of the protein. The contact number is estimated as the number of Cα atoms within a sphere of radius r (8 Å) centered at the Cα atom of the given residue. Predicting the surface-interior diagram (the plot of contact number against residue number) from an amino acid sequence may be a meaningful alternative to the secondary-structure prediction currently performed. Parameter values are determined empirically from the observed contact numbers calculated from the known structures of 39 proteins. To assess the real efficiency of the method, the proteins are divided into two groups: one group is used to derive the parameter sets and the other serves to test the prediction accuracy. The test reveals that the empirically determined parameter sets are significantly biased towards the database, to an extent roughly proportional to the number of parameter terms included. The results show that adequate smoothing of a parameter set is the best way to reduce this bias and to give the best prediction for 'unknown' proteins. The prediction accuracy finally obtained is about 0.4 (or roughly 70%) on average, as measured by the correlation coefficient between the predicted and observed diagrams. This value is of the same order as the accuracy of current secondary-structure predictions.
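The contact-number definition and the correlation-based accuracy measure described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `contact_numbers` and `pearson`, the toy coordinates, the choice to exclude the central residue from its own count, and the use of an inclusive cutoff (distance ≤ 8 Å) are all assumptions made for illustration.

```python
import math

def contact_numbers(ca_coords, radius=8.0):
    """For each residue, count the other C-alpha atoms within `radius` angstroms.

    Assumptions (not stated in the abstract): the central residue is not
    counted, and an atom exactly at the cutoff distance is counted.
    """
    counts = []
    for i, ci in enumerate(ca_coords):
        n = 0
        for j, cj in enumerate(ca_coords):
            if i != j and math.dist(ci, cj) <= radius:
                n += 1
        counts.append(n)
    return counts

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy chain: six C-alpha atoms on a straight line, 3.8 angstroms apart.
toy_chain = [(3.8 * k, 0.0, 0.0) for k in range(6)]

# The "observed" diagram is the list of contact numbers along the sequence.
observed = contact_numbers(toy_chain)

# A made-up "predicted" diagram, only to show how the two are compared.
predicted = [2.5, 3.0, 3.5, 3.5, 3.0, 2.5]
accuracy = pearson(predicted, observed)
```

In this toy chain the end residues have fewer neighbours than the central ones, which mirrors the surface-versus-interior distinction that the contact number is meant to capture. For real proteins the coordinates would come from a solved structure rather than a synthetic line.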
