The Natural History of Biocatalytic Mechanisms

Phylogenomic analysis of the occurrence and abundance of protein domains in proteomes has recently shown that the α/β architecture is probably the oldest fold design. This holds important implications for the origins of biochemistry. Here we explore structure-function relationships addressing the use of chemical mechanisms by ancestral enzymes. We test the hypothesis that the oldest folds used the most mechanisms. We start by tracing biocatalytic mechanisms operating in metabolic enzymes along a phylogenetic timeline of the first appearance of homologous superfamilies of protein domain structures from CATH. A total of 335 enzyme reactions were retrieved from MACiE and mapped onto fold age. We define a mechanistic step type as one of the 51 mechanistic annotations given in MACiE, and each step of each of the 335 mechanisms was described using one or more of these annotations. We find that the first two folds, the P-loop containing nucleotide triphosphate hydrolase and the NAD(P)-binding Rossmann-like homologous superfamilies, were α/β architectures responsible for introducing 35% (18/51) of the known mechanistic step types. Many of the step types introduced by these two oldest structures were later spread combinatorially through catalytic history. The most common mechanistic step types included fundamental building blocks of enzyme chemistry: “Proton transfer,” “Bimolecular nucleophilic addition,” “Bimolecular nucleophilic substitution,” and “Unimolecular elimination by the conjugate base.” They were associated with the most ancestral fold structure, typical of P-loop containing nucleotide triphosphate hydrolases. Over half of the mechanistic step types were introduced in the evolutionary timeline before the appearance of structures specific to diversified organisms, during a period of architectural diversification. The other half unfolded gradually after organismal diversification, during a period that spanned ∼2 billion years of evolutionary history.
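The mapping of mechanistic step types onto fold age described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the relative ages, the "Hypothetical ..." names, and the function names `first_appearances` and `cumulative_fraction` are all assumptions made for illustration, and the toy data carry no results from the study. The sketch assigns each step type to the oldest superfamily that uses it, then reports the fraction of step types introduced by each age.

```python
# Toy records (all values hypothetical): superfamily name, relative fold age
# (0 = oldest, 1 = most recent), and the step types annotated for its enzymes.
records = [
    ("P-loop NTP hydrolase", 0.00,
     {"Proton transfer", "Bimolecular nucleophilic substitution"}),
    ("NAD(P)-binding Rossmann-like", 0.02,
     {"Proton transfer", "Bimolecular nucleophilic addition"}),
    ("Hypothetical superfamily A", 0.40,
     {"Unimolecular elimination by the conjugate base", "Proton transfer"}),
    ("Hypothetical superfamily B", 0.75,
     {"Hypothetical step type E"}),
]


def first_appearances(records):
    """Map each step type to (age, superfamily) of the oldest superfamily using it."""
    first = {}
    # Visit superfamilies from oldest to youngest; keep only the first sighting.
    for name, age, steps in sorted(records, key=lambda r: r[1]):
        for step in steps:
            first.setdefault(step, (age, name))
    return first


def cumulative_fraction(first, total_step_types):
    """Return [(age, fraction of step types introduced at or before that age)]."""
    ages = sorted({age for age, _ in first.values()})
    return [
        (a, sum(1 for age, _ in first.values() if age <= a) / total_step_types)
        for a in ages
    ]


first = first_appearances(records)
timeline = cumulative_fraction(first, total_step_types=len(first))
for age, fraction in timeline:
    print(f"age {age:.2f}: {fraction:.0%} of step types introduced")
```

With real data, `total_step_types` would be the 51 MACiE annotations, so the two oldest superfamilies introducing 18 of them would correspond to 18/51 ≈ 35%.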
