An Extended RNA Code and its Relationship to the Standard Genetic Code: An Algebraic and Geometrical Approach

An algebraic and geometrical approach is used to describe the primaeval RNA code and a proposed Extended RNA code. The former consists of all codons of the type RNY, where R denotes a purine, Y a pyrimidine, and N any nucleotide. The latter comprises the 16 codons of the type RNY plus the codons obtained by reading the RNA code in the second (NYR type) and third (YRN type) reading frames. Each of the three subsets contains 16 triplets, for a total of 48 triplets that specify 17 of the 20 amino acids and include AUG, the start codon, and the three known stop codons. The remaining 16 codons do not belong to the Extended RNA code; they constitute the union of the triplets of types YYY and RRR, which we define as the RNA-less code. The codons in each of the three subsets of the Extended RNA code are represented by a four-dimensional hypercube, and the set of codons of the RNA-less code is portrayed as a four-dimensional hyperprism. Remarkably, the union of these four symmetrical, pairwise disjoint sets is precisely the already known six-dimensional hypercube of the Standard Genetic Code (SGC) of 64 triplets. These results suggest a plausible evolutionary path by which the primaeval RNA code could have given rise to the SGC, via the Extended RNA code plus the RNA-less code. We argue that the life forms that probably obeyed the Extended RNA code were intermediate between the ribo-organisms of the RNA World and the last common ancestor (LCA) of the Prokaryotes, Archaea, and Eucarya, that is, the cenancestor. A general encoding function, E, which maps each codon to its corresponding amino acid or to the stop signal, is also derived. In 45 of the 64 cases, this function takes the form of a linear transformation F, which projects the whole six-dimensional hypercube onto a four-dimensional hyperface formed by all triplets that end in cytosine. In the remaining 19 cases, the function E takes the form of an affine transformation, i.e., the composition of F with a particular translation. Graphical representations of the four local encoding functions and of E are presented and discussed. For every amino acid and for the stop signal, a single triplet, among those that specify it, is selected as a canonical representative. From this mapping, a graphical representation of the 20 amino acids and the stop signal is also derived. We conclude that the general encoding function E represents the SGC itself.
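The partition described in the abstract can be checked by direct enumeration. The following is a minimal sketch, not the authors' algebraic construction: it assumes the standard codon table written as a compact string in U, C, A, G order, and the helper names (`ry_pattern`, `classify`) are hypothetical. It sorts the 64 codons into the RNY, NYR, YRN, and RNA-less (RRR or YYY) classes and counts the amino acids specified by the 48 codons of the Extended RNA code.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in U, C, A, G order.
# "*" marks a stop codon. (Compact table layout is an illustrative choice.)
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = set("AG")  # R = purine (A, G); Y = pyrimidine (U, C)


def ry_pattern(codon):
    """Reduce a codon to its purine/pyrimidine pattern, e.g. 'AUG' -> 'RYR'."""
    return "".join("R" if base in PURINES else "Y" for base in codon)


def classify(codon):
    """Assign a codon to RNY, NYR, YRN, or the RNA-less class (RRR or YYY)."""
    p = ry_pattern(codon)
    if p[0] == "R" and p[2] == "Y":
        return "RNY"   # primaeval RNA code, first reading frame
    if p[1] == "Y" and p[2] == "R":
        return "NYR"   # second reading frame
    if p[0] == "Y" and p[1] == "R":
        return "YRN"   # third reading frame
    return "RNA-less"  # only RRR and YYY patterns remain


# Group all 64 codons by class.
classes = {"RNY": [], "NYR": [], "YRN": [], "RNA-less": []}
for codon in CODON_TABLE:
    classes[classify(codon)].append(codon)

# Codons of the Extended RNA code and the amino acids they specify.
extended = classes["RNY"] + classes["NYR"] + classes["YRN"]
extended_amino_acids = {CODON_TABLE[c] for c in extended} - {"*"}
all_amino_acids = set(AMINO) - {"*"}
missing = all_amino_acids - extended_amino_acids

print({name: len(codons) for name, codons in classes.items()})
print(len(extended), "codons specify", len(extended_amino_acids), "amino acids")
print("not specified by the Extended RNA code:", sorted(missing))
```

Under these assumptions each class contains 16 codons, the Extended RNA code covers 17 amino acids, and the three it leaves out (F, K, and E in one-letter notation) are encoded only by YYY or RRR triplets. AUG falls in the NYR class and the three stop codons in the YRN class, in agreement with the abstract.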
