An analytical model of gene evolution with six mutation parameters: An application to archaeal circular codes

We develop an analytical evolutionary model based on a 64 x 64 trinucleotide mutation matrix with six substitution parameters, associated with the transitions and transversions at the three trinucleotide sites. It generalizes the previous models based on 4 x 4 nucleotide mutation matrices and on the 64 x 64 trinucleotide mutation matrix with three parameters. It determines, at a given time t, the exact occurrence probabilities of trinucleotides mutating randomly according to the six substitution parameters. An application of this model allows an evolutionary study of the common circular code COM and the 15 archaeal circular codes X, which have recently been identified in several archaeal genomes. The main property of a circular code is the retrieval of the reading frames in genes, both locally, i.e. anywhere in genes and in particular without a start codon, and automatically, with a window of a few nucleotides. In genes, the circular code is superimposed on the traditional genetic code. Very unexpectedly, the evolutionary model demonstrates that the archaeal circular codes can derive from the common circular code subjected to random substitutions with particular values for the six substitution parameters. The model correlates strongly with the statistical observations of three archaeal codes in actual genes. Furthermore, the properties of these substitution rates allow us to propose an evolutionary classification of the 15 archaeal codes into three main classes according to this model. In almost all cases, the substitution rates agree with the actual degeneracy of the genetic code, with substitutions more frequent in the third trinucleotide site and with transitions more frequent than transversions in any trinucleotide site.
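To make the idea of six substitution parameters concrete, the following minimal sketch computes trinucleotide occurrence probabilities at time t under a simplifying assumption: each of the three sites evolves independently as a continuous-time Kimura two-parameter process with its own transition and transversion rate. This is an illustration, not the derivation used in the paper; the function names, the rate normalization (rates are per specific substitution) and the toy inputs in the usage note are all hypothetical.

```python
import math
from itertools import product

BASES = "ACGT"
PURINES = {"A", "G"}
# All 64 trinucleotides in lexicographic order.
TRINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=3)]


def site_probability(x, y, ts_rate, tv_rate, t):
    """Kimura two-parameter probability that base x is base y after time t.

    ts_rate: rate of the single possible transition (A<->G or C<->T).
    tv_rate: rate of each of the two possible transversions.
    """
    e_tv = math.exp(-4.0 * tv_rate * t)
    e_mix = math.exp(-2.0 * (ts_rate + tv_rate) * t)
    if x == y:
        return 0.25 + 0.25 * e_tv + 0.5 * e_mix
    if (x in PURINES) == (y in PURINES):  # same chemical class: transition
        return 0.25 + 0.25 * e_tv - 0.5 * e_mix
    return 0.25 - 0.25 * e_tv  # one specific transversion


def evolve(initial, rates, t):
    """Return the trinucleotide probabilities at time t.

    initial: dict mapping trinucleotide -> probability at time 0.
    rates:   three (ts_rate, tv_rate) pairs, one per trinucleotide site,
             i.e. six substitution parameters in total (assumed layout).
    """
    result = {w: 0.0 for w in TRINUCLEOTIDES}
    for src, p0 in initial.items():
        for dst in TRINUCLEOTIDES:
            p = p0
            # Independent sites: multiply the three per-site probabilities.
            for k in range(3):
                ts_rate, tv_rate = rates[k]
                p *= site_probability(src[k], dst[k], ts_rate, tv_rate, t)
            result[dst] += p
    return result
```

As a usage example with made-up values, `evolve({"AAC": 0.5, "GTT": 0.5}, [(0.10, 0.02), (0.05, 0.01), (0.40, 0.10)], 1.0)` returns a 64-entry distribution that sums to 1. The rates here are chosen only to mimic the qualitative pattern described above (third site fastest, transitions more frequent than transversions) and are not fitted to any data.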
