Paper information - A Possible Code in the Genetic Code

A Possible Code in the Genetic Code

To analyse the genetic code, the distribution of the 64 trinucleotides $w$ (words of 3 letters over the gene alphabet {A, C, G, T}, $w \in \tau$ = {AAA, ..., TTT}) in prokaryotic protein coding genes (words of large size) is studied with autocorrelation functions. In coding genes, a trinucleotide $w_p$ can be read in 3 frames $p$ ($p=0$: reference frame, $p=1$: reference frame shifted by 1 letter, $p=2$: reference frame shifted by 2 letters). The autocorrelation function $w_p(N)_i w'$ analyses the occurrence probability of the $i$-motif $w_p(N)_i w'$, i.e. 2 trinucleotides, $w_p$ in frame $p$ and $w'$ in any frame ($w, w' \in \tau$), separated by any $i$ bases $N$ ($N$ = A, C, G or T). The $64^2 \times 3 = 12288$ autocorrelation functions applied to the prokaryotic protein coding genes are almost all non-random and show a modulo 3 periodicity of one of 3 types: 0 modulo 3, 1 modulo 3 or 2 modulo 3. Classifying the 12288 $i$-motifs $w_p(N)_i w'$ by type of periodicity implies a constant preferential occurrence frame for $w'$, independent of $w$ and $p$. Three subsets of trinucleotides are identified: 22 trinucleotides in frame 0 forming the subset $\tau_0$ = {AAA, AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC, TTT}, and 21 trinucleotides in each of the frames 1 and 2 forming the subsets $\tau_1$ and $\tau_2$ respectively. Except for AAA, CCC, GGG and TTT, the subsets $\tau_1$ and $\tau_2$ are generated by a circular permutation $P$ of $\tau_0$: $P(\tau_0)=\tau_1$ and $P(\tau_1)=\tau_2$. Furthermore, the complementarity property $\mathcal{C}$ of the DNA double helix (i.e. $\mathcal{C}(A)=T$, $\mathcal{C}(C)=G$, $\mathcal{C}(G)=C$, $\mathcal{C}(T)=A$, and if $w=l_1 l_2 l_3$ then $\mathcal{C}(w)=\mathcal{C}(l_3)\mathcal{C}(l_2)\mathcal{C}(l_1)$ with $l_1, l_2, l_3 \in$ {A, C, G, T}) is observed in these 3 subsets: $\mathcal{C}(\tau_0)=\tau_0$, $\mathcal{C}(\tau_1)=\tau_2$ and $\mathcal{C}(\tau_2)=\tau_1$.
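The permutation and complementarity properties stated in the abstract can be checked mechanically from the listed frame-0 subset. The following is a minimal sketch, not the authors' procedure. It assumes that the circular permutation P maps a trinucleotide l1l2l3 to l2l3l1, which the abstract does not spell out. The abstract also does not say which of CCC and GGG belongs to which shifted-frame subset, so placing CCC in frame 1 and GGG in frame 2 is an assumption made here for illustration. All function and variable names are hypothetical.

```python
# Complementary base pairing of the DNA double helix.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The 22 frame-0 trinucleotides exactly as listed in the abstract.
TAU0 = frozenset(
    "AAA AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC GAG GAT GCC GGC GGT "
    "GTA GTC GTT TAC TTC TTT".split()
)
HOMOPOLYMERS = frozenset({"AAA", "CCC", "GGG", "TTT"})


def complement(word):
    """C(l1 l2 l3) = C(l3) C(l2) C(l1): complement each base, reversed order."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def permute(word):
    """Circular permutation l1 l2 l3 -> l2 l3 l1 (assumed reading of P)."""
    return word[1:] + word[0]


def image(func, words):
    """Apply func to every word of a set and return the resulting set."""
    return frozenset(func(word) for word in words)


# The 20 non-homopolymer trinucleotides of frame 0 and their permuted images.
core0 = TAU0 - HOMOPOLYMERS
core1 = image(permute, core0)
core2 = image(permute, core1)

# Assumption: CCC is attached to frame 1 and GGG to frame 2.
tau1 = core1 | {"CCC"}
tau2 = core2 | {"GGG"}

if __name__ == "__main__":
    print("sizes:", len(TAU0), len(tau1), len(tau2))
    print("C(tau0) == tau0:", image(complement, TAU0) == TAU0)
    print("C(tau1) == tau2:", image(complement, tau1) == tau2)
    print("C(tau2) == tau1:", image(complement, tau2) == tau1)
    print("cores are pairwise disjoint:",
          len(core0 | core1 | core2) == len(core0) + len(core1) + len(core2))
```

Under these assumptions the three 20-word cores are pairwise disjoint and together cover the 60 trinucleotides that are not homopolymers, which is consistent with the subset sizes 22, 21 and 21 given in the abstract.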

Didier Arquès | Christian J. Michel
