Reduced alphabet of prebiotic amino acids optimally encodes the conformational space of diverse extant protein folds

Background: There is wide agreement that only a subset of the twenty standard amino acids existed prebiotically in sufficient concentrations to form functional polypeptides. We ask how this subset, postulated as {A,D,E,G,I,L,P,S,T,V}, could have formed structures stable enough to found metabolic pathways. Inspired by alphabet reduction experiments, we undertook a computational analysis to measure the structural coding behavior of sequences simplified by reduced alphabets. We sought to identify characteristics of the prebiotic set that would endow it with unique properties relevant to structure, stability, and folding.

Results: Drawing on a large dataset of single-domain proteins, we employed an information-theoretic measure to assess how well the prebiotic amino acid set preserves fold information, compared with all other possible ten-amino-acid sets. An extensive virtual mutagenesis procedure revealed that the prebiotic set preserves sequence-dependent information about both the backbone conformation and the tertiary contact matrix of proteins exceptionally well. We observed that information retention is fold-class dependent: the prebiotic set sufficiently encodes the structure space of α/β and α + β folds and, to a lesser extent, of all-α and all-β folds. The prebiotic set appeared insufficient to encode small proteins. Assessing how well the prebiotic set discriminates native from incorrect sequence-structure matches, we found that α/β and α + β folds exhibit more pronounced energy gaps with the prebiotic set than with nearly all alternative sets.

Conclusions: The prebiotic set optimally encodes local backbone structures that appear in the folded environment and near-optimally encodes the tertiary contact matrix of extant proteins. The fold-class-specific patterns observed in our structural analysis confirm the postulated timeline of fold appearance in proteogenesis derived from proteomic sequence analyses. Polypeptides arising in a prebiotic environment would likely form α/β and α + β-like folds, if any at all. We infer that the progressive expansion of the alphabet allowed the increased conformational stability and functional specificity of later folds, including all-α, all-β, and small proteins. Our results suggest that prebiotic sequences are amenable to mutations that significantly lower native conformational energies and increase discrimination against incorrect folds. This property may have assisted the genesis of functional proto-enzymes prior to the expansion to the full amino acid alphabet.
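To make the idea of comparing ten-letter alphabets by retained information concrete, the following is a minimal sketch and not the authors' procedure. It assumes that fold information can be approximated by the mutual information between an amino acid letter and a discrete structural state, and that a reduced alphabet is applied through a substitution map. The toy counts, the state labels, the `toy_map` substitutions, and the function names are all hypothetical; only the prebiotic set and the comparison against all ten-letter subsets come from the abstract.

```python
from math import comb, log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PREBIOTIC = frozenset("ADEGILPSTV")  # set postulated in the abstract

# Number of distinct ten-letter alphabets drawn from the twenty standard letters.
N_CANDIDATE_SETS = comb(len(AMINO_ACIDS), 10)


def mutual_information(joint):
    """Mutual information in bits from counts {(letter, state): n}."""
    total = sum(joint.values())
    p_letter, p_state = {}, {}
    for (letter, state), n in joint.items():
        p_letter[letter] = p_letter.get(letter, 0.0) + n / total
        p_state[state] = p_state.get(state, 0.0) + n / total
    mi = 0.0
    for (letter, state), n in joint.items():
        if n > 0:
            p = n / total
            mi += p * log2(p / (p_letter[letter] * p_state[state]))
    return mi


def reduce_joint(joint, mapping):
    """Merge counts after replacing each letter by its reduced-alphabet image."""
    reduced = {}
    for (letter, state), n in joint.items():
        key = (mapping[letter], state)
        reduced[key] = reduced.get(key, 0) + n
    return reduced


# Hypothetical toy counts of letters observed in two structural states.
toy_joint = {
    ("A", "helix"): 40, ("A", "strand"): 10,
    ("V", "helix"): 10, ("V", "strand"): 40,
    ("K", "helix"): 30, ("K", "strand"): 20,
    ("F", "helix"): 15, ("F", "strand"): 35,
}
# Hypothetical substitution map: K and F lie outside the prebiotic set,
# so each is replaced by a member of it (an assumption for illustration).
toy_map = {"A": "A", "V": "V", "K": "A", "F": "V"}

full_info = mutual_information(toy_joint)
reduced_info = mutual_information(reduce_joint(toy_joint, toy_map))
```

Because the reduced letter is a deterministic function of the original letter, `reduced_info` can never exceed `full_info`. Ranking a candidate alphabet would then amount to repeating this calculation for each of the `N_CANDIDATE_SETS` subsets, under whatever substitution rule one chooses to assume.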
