Adaptive Properties of the Genetically Encoded Amino Acid Alphabet Are Inherited from Its Subsets

Life uses a common set of 20 coded amino acids (CAAs) to construct proteins. This set was likely canonicalized during early evolution; before then, smaller amino acid sets were gradually expanded as new synthetic, proofreading, and coding mechanisms became biologically available. Many possible subsets of the modern CAAs, or of other presently uncoded amino acids, could have made up the earlier sets. We explore the hypothesis that the CAAs were selectively fixed because of their unique adaptive chemical properties, which facilitate the folding, catalysis, and solubility of proteins and gave adaptive value to organisms able to encode them. Specifically, we studied in silico hypothetical CAA sets of 3–19 amino acids drawn from a library of 1913 structurally diverse α-amino acids, comparing the adaptive value of their combined physicochemical properties with that of the modern CAA set. We find that even hypothetical sets containing modern CAA members are especially adaptive: even among a large choice of alternatives, it is difficult to find sets that cover chemical property space more amply. These results suggest that each time a CAA was discovered and embedded during evolution, it provided an adaptive value unusual among many alternatives, and that each selective step may have helped bootstrap the developing set to include still more CAAs.
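The kind of comparison described above, a reference set judged against many same-size alternatives drawn from a larger library, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the property values are synthetic random numbers rather than real amino acid descriptors, coverage is reduced to the range along a single property axis, and the names `coverage` and `fraction_beaten` are hypothetical. It is not the metric or the code used in the study.

```python
import random


def coverage(values):
    """Range spanned along one property axis (a simple stand-in for coverage)."""
    return max(values) - min(values)


def fraction_beaten(reference, library, n_samples=1000, seed=0):
    """Fraction of random same-size sets whose coverage exceeds the reference's."""
    rng = random.Random(seed)
    ref_cov = coverage(reference)
    better = 0
    for _ in range(n_samples):
        # Draw an alternative set of the same size, without replacement.
        sample = rng.sample(library, len(reference))
        if coverage(sample) > ref_cov:
            better += 1
    return better / n_samples


# Synthetic placeholder library: 1913 made-up values for one property.
rng = random.Random(42)
library = [rng.uniform(0.0, 1.0) for _ in range(1913)]

# Hypothetical reference set that happens to span the whole library range,
# so no alternative set can cover strictly more of this axis.
reference = [min(library), max(library)] + rng.sample(library, 8)

print(fraction_beaten(reference, library))
```

A small fraction means few random alternatives cover the axis more amply than the reference set. A realistic analysis would use several measured or computed physicochemical properties and a coverage measure suited to more than one dimension.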
