A colorful origin for the genetic code: Information theory, statistical mechanics and the emergence of molecular codes

The genetic code maps the sixty-four nucleotide triplets (codons) to twenty amino-acids. While the biochemical details of this code were unraveled long ago, its origin is still obscure. We review information-theoretic approaches to the problem of the code's origin and discuss the results of recent work that treats the code as an evolving, error-prone information channel. Our model, which uses the rate-distortion theory of noisy communication channels, suggests that the genetic code originated from the interplay of three conflicting evolutionary forces: the need for diverse amino-acids, for error-tolerance, and for minimal cost of resources. Describing the code as an information channel allows us to identify the fitness of the code mathematically and to locate its emergence at a second-order phase transition, at which the mapping of codons to amino-acids becomes nonrandom. The noise in the channel gives rise to an error-graph, in which edges connect codons that are likely to be confused. The emergence of the code is governed by the topology of the error-graph, which determines the lowest modes of the graph-Laplacian and is related to the map-coloring problem.
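
To make the error-graph and its graph-Laplacian concrete, the following is a minimal sketch under a simplifying assumption that is not taken from the paper: two codons are joined by an edge when they differ in exactly one nucleotide, and all such edges carry equal weight. The function names `codon_error_graph` and `laplacian_spectrum` are illustrative only. The sketch builds the 64-vertex graph and computes the lowest modes of its Laplacian with NumPy.

```python
import itertools
import numpy as np

BASES = "ACGU"  # RNA alphabet


def codon_error_graph():
    """Toy error-graph (assumption): link codons that differ in exactly one base."""
    codons = ["".join(c) for c in itertools.product(BASES, repeat=3)]
    n = len(codons)
    adj = np.zeros((n, n), dtype=int)
    for i, a in enumerate(codons):
        for j, b in enumerate(codons):
            # Hamming distance 1 means a single-base misreading
            if sum(x != y for x, y in zip(a, b)) == 1:
                adj[i, j] = 1
    return codons, adj


def laplacian_spectrum(adj):
    """Eigenvalues (ascending) of the graph-Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.linalg.eigvalsh(lap)


codons, adj = codon_error_graph()
eigs = laplacian_spectrum(adj)

# The zero mode is the uniform (random) mapping; the next eigenvalue
# and its multiplicity describe the lowest non-uniform modes.
lowest_nonzero = eigs[1]
multiplicity = int(np.sum(np.isclose(eigs, lowest_nonzero)))
print(len(codons), int(adj[0].sum()), round(float(lowest_nonzero), 6), multiplicity)
```

Under this equal-weight assumption each codon has nine neighbours and the lowest nonzero Laplacian eigenvalue is degenerate. A more realistic error-graph would weight edges by position-dependent misreading rates, which changes both the spectrum and the topology.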
