G and T nucleotide contents show species-invariant negative correlation for all three codon positions.

The nucleotide contents of the three codon positions show a number of statistical pairwise correlations, some of which are universal across all analysed genomes. Among the most prominent are negative correlations between G and T contents, found in the genes of all species analysed. The pair A/C, which is complementary to G/T, shows a similar negative correlation in the genes of most species. In the genes of several species, including all mammalian genes studied, positive correlations are found between A and T contents and between G and C contents. Since these regularities are observed in all three codon positions, they are connected with the amino-acid content of proteins. Such correlations may originate from features of the mutation process and/or from a translation reading-frame check. The well-known preference for G in the first codon position and its deficiency in the second are accompanied by the opposite bias in T content. In the third codon position there is no general nucleotide preference, but its nucleotide content is often biased in relation to the GC content of the gene. In this case, G and T contents are always shifted in opposite directions. Several ideas are put forward to explain this preference.
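
The kind of measurement described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each gene is given as an in-frame coding sequence, computes the fraction of each nucleotide at codon positions 1 to 3, and then takes the Pearson correlation between G and T contents across genes. The function names (`position_contents`, `pearson`, `gt_correlation`) and the toy sequences are hypothetical and chosen only for illustration.

```python
from math import sqrt


def position_contents(cds):
    """Return {position: {base: fraction}} for codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    contents = {}
    for offset in range(3):
        column = cds[offset:usable:3]  # every base at this codon position
        contents[offset + 1] = {b: column.count(b) / len(column) for b in "ACGT"}
    return contents


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def gt_correlation(genes, position):
    """Correlation between G and T contents at one codon position across genes."""
    g = [position_contents(seq)[position]["G"] for seq in genes]
    t = [position_contents(seq)[position]["T"] for seq in genes]
    return pearson(g, t)


# Toy sequences (made up, not real genes): G is progressively replaced
# by T in the first codon position.
toy_genes = ["GCAGCAGCA", "GCATCAGCA", "TCATCAGCA"]
r_first_position = gt_correlation(toy_genes, 1)
```

On real data the same calculation would be repeated for each codon position and each species, with one point per gene; the toy set here is constructed so that the first-position G and T contents are perfectly anticorrelated.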
