Evolutionarily Conserved Substrate Substructures for Automated Annotation of Enzyme Superfamilies

The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while others vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. By analogy with the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among the substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that react in these substrates, and then examined the relationship between the two. Across the 42 superfamilies analyzed, we found substantial variation in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, for function prediction, and for guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to that provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of uncharacterized enzymes and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of the fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
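
The relationship between conserved and reacting substructures described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each substrate has already been reduced to two sets of atom indices (the atoms in the superfamily-wide conserved substructure and the atoms that react), and the names `SubstrateAnnotation`, `reacting_fraction_of_conserved`, and `superfamily_summary`, as well as the example data, are hypothetical. The overlap fraction used here is one plausible way to quantify "how much of the conserved substructure is reacting"; the abstract does not specify the exact measure.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class SubstrateAnnotation:
    """Hypothetical record: atom indices of one substrate (an assumption)."""
    name: str
    conserved_atoms: FrozenSet[int]  # atoms in the superfamily-conserved substructure
    reacting_atoms: FrozenSet[int]   # atoms that change during the reaction


def reacting_fraction_of_conserved(s: SubstrateAnnotation) -> float:
    """Fraction of the conserved substructure's atoms that are also reacting."""
    if not s.conserved_atoms:
        return 0.0
    overlap = s.conserved_atoms & s.reacting_atoms
    return len(overlap) / len(s.conserved_atoms)


def superfamily_summary(substrates: List[SubstrateAnnotation]) -> Dict[str, float]:
    """Summarize the overlap fraction across all substrates of one superfamily."""
    if not substrates:
        raise ValueError("at least one substrate is required")
    fractions = [reacting_fraction_of_conserved(s) for s in substrates]
    return {
        "min": min(fractions),
        "max": max(fractions),
        "mean": sum(fractions) / len(fractions),
    }


# Illustrative, made-up data: two substrates of one hypothetical superfamily.
example = [
    SubstrateAnnotation("substrate_A", frozenset({0, 1, 2, 3}), frozenset({2, 3, 4})),
    SubstrateAnnotation("substrate_B", frozenset({0, 1, 2, 3}), frozenset({0, 1, 2, 3})),
]
summary = superfamily_summary(example)
```

A mean near 1 would indicate that most of the conserved substructure takes part in the reaction, while a value near 0 would indicate that the conserved atoms are largely non-reacting. Intermediate values of the kind reported across the 42 superfamilies are what make a clean two-category split difficult.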
