Simple procedures are proposed to quantify how much an effective property embodied in a given ranking of the twenty amino acids can be affected by random point mutations at nucleotide bases. As expected, of the various orderings tested, rankings based on most hydrophobicity scales exhibit low scores, indicating greater robustness to such single-base mutations. The extent of this robustness varies, however, and the method discriminates sharply between the scales. Hydrophobicity scales based on global properties, such as spatial environment data for protein residues or mutation matrices of amino acid replacements, generally perform better than those based on purely physicochemical properties of isolated residues. An averaged scale built from the available hydrophobicity scales exhibits one of the most favorable scores. A systematic search for the best amino acid order has been carried out across all possible scales. Optimized scales are characterized by a clustering scheme into three zones, within which permutations are tolerated to varying degrees, depending on the zone and on the summation procedure used in the score calculation. The first cluster corresponds to the hydrophobic side and includes the ten amino acids WMCFILVGRS. Next follows the triad ATP (Ala, Thr, Pro). The third cluster coincides with the hydrophilic side and includes, in the last seven positions, the amino acids EDKNQHY. Interpretation of these optimized scales in terms of codon positions in the genetic code further suggests a clustering scheme composed of four groups, WMCFILV-GRS-ATP-EDKNQHY, emphasizing the role of the second base as the main driving parameter. As a consequence, the conserved character of the genetic code is better reflected when it is displayed in UGCA ordering rather than in the commonly used UCAG ordering. The present a priori classification of the amino acids could find potential use in protein sequence homology and structure prediction.
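To make the idea of scoring a ranking against single-base mutations concrete, the following is a minimal sketch in Python. The abstract does not give the score formula, so the definition used here is an assumption: the mean absolute change in rank over all single-base substitutions between sense codons of the standard genetic code, with substitutions to or from stop codons ignored. The function name `mutation_score` and its `position` parameter are hypothetical, introduced only for illustration; DNA letters (T in place of U) are used for the codons.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def mutation_score(ranking, position=None):
    """Mean absolute rank change over single-base substitutions.

    ranking  : string of the 20 one-letter amino acid codes, in rank order.
    position : None to use all three codon positions, or 0, 1, 2 to
               restrict substitutions to one position.
    This scoring rule is an assumed stand-in, not the paper's formula.
    """
    rank = {aa: i for i, aa in enumerate(ranking)}
    if len(ranking) != 20 or set(rank) != set(AMINO) - {"*"}:
        raise ValueError("ranking must list each of the 20 amino acids once")

    positions = range(3) if position is None else [position]
    total, count = 0, 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos in positions:
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a mutation
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == "*":
                    continue  # skip nonsense mutations
                total += abs(rank[aa] - rank[new_aa])
                count += 1
    return total / count


if __name__ == "__main__":
    clustered = "WMCFILVGRSATPEDKNQHY"  # order quoted in the abstract
    print("all positions:", mutation_score(clustered))
    for pos in range(3):
        print("codon position", pos + 1, ":", mutation_score(clustered, pos))
```

Under this assumed definition a lower score means that a random point mutation moves an amino acid a shorter distance along the ranking. Restricting `position` gives a rough view of how each codon position contributes, which is one way to explore the role of the second base mentioned above.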