The Standard Genetic Code Can Evolve from a Two-Letter GC Code Without Information Loss or Costly Reassignments

It is widely agreed that the standard genetic code must have been preceded by a simpler code that encoded fewer amino acids. How this simpler code could have expanded into the standard genetic code is not well understood, because most changes to an existing code are costly. Taking inspiration from the recently synthesized six-letter code, we propose a novel hypothesis: the initial genetic code consisted of only two letters, G and C, and the number of available codons then expanded through the introduction of an additional pair of letters, A and U. Several lines of evidence, including the relative prebiotic abundance of the earliest assigned amino acids, the balance of their hydrophobicity, and the higher GC content of genome coding regions, indicate that the original two nucleotides were indeed G and C. The expansion probably started at the third base, continued at the second base, and arrived at the standard genetic code when the second pair of letters was introduced at the first base. The proposed process is consistent with the available empirical evidence, and it uniquely avoids the problem of costly code changes by positing that the code expanded its capacity through the creation of new codons with extra letters rather than through the reassignment of existing ones.
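
The arithmetic behind the proposed expansion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes triplet codons at every stage and reads off the amino acids that the modern standard code assigns to each codon subset, which need not match the intermediate assignments the authors have in mind. The names `STAGES`, `codons_at`, and `amino_acids_at` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code (RNA alphabet), codons enumerated in UCAG order
# with the third base varying fastest. '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Assumed stages: A and U become available at the third base first,
# then at the second base, and finally at the first base.
STAGES = [
    ("G/C at every position", ("GC", "GC", "GC")),
    ("A/U added at third base", ("GC", "GC", BASES)),
    ("A/U added at second base", ("GC", BASES, BASES)),
    ("A/U added at first base", (BASES, BASES, BASES)),
]


def codons_at(alphabets):
    """All triplets whose i-th base is drawn from alphabets[i]."""
    return ["".join(codon) for codon in product(*alphabets)]


def amino_acids_at(alphabets):
    """Amino acids the standard code assigns to those triplets (stops excluded)."""
    return {STANDARD_CODE[c] for c in codons_at(alphabets)} - {"*"}


if __name__ == "__main__":
    for name, alphabets in STAGES:
        codons = codons_at(alphabets)
        acids = "".join(sorted(amino_acids_at(alphabets)))
        print(f"{name}: {len(codons)} codons, {len(acids)} amino acids ({acids})")
```

Each stage's codon set contains the previous one, so no existing codon has to change meaning as the alphabet grows. Under the standard code, the eight G/C-only codons encode glycine, alanine, arginine, and proline, and opening the third base to A and U adds codons without adding amino acids, because those four codon boxes are fully degenerate at the third position.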
