Exploring the Evolution of Novel Enzyme Functions within Structurally Defined Protein Superfamilies

To understand the evolution of enzyme reactions and to gain an overview of biological catalysis, we combined sequence and structural data to generate phylogenetic trees for 276 structurally defined enzyme superfamilies, and used these trees to study how enzyme functions have evolved. We describe the analysis of two superfamilies in detail to illustrate different paradigms of enzyme evolution. Data gathered across all the superfamilies support and extend the observation that they have all evolved to act on a diverse set of substrates, whereas the evolution of new chemistry is much less common. Even so, the volume of data allows us to provide a comprehensive overview of the most common and the rarest types of change in function. Our analysis demonstrates, on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within it. We used the trees to generate a matrix of observed exchanges from one enzyme function to another, which reveals the scale and nature of enzyme evolution and shows that some types of exchange between and within E.C. classes are more prevalent than others. Surprisingly, a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life.
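
The idea of a matrix of observed exchanges can be illustrated with a minimal sketch. It assumes that each tree edge has already been reduced to a pair of E.C. numbers (ancestral function, descendant function); the function names, the edge representation and the toy data are hypothetical and are not taken from the paper. The default of six primary classes is likewise an assumption and is exposed as a parameter.

```python
def ec_class(ec_number: str) -> int:
    """Return the primary E.C. class, i.e. the first field of an E.C. number."""
    return int(ec_number.split(".")[0])


def exchange_matrix(edges, n_classes=6):
    """Count function exchanges along tree edges at the primary E.C. level.

    `edges` is an iterable of (parent_ec, child_ec) pairs (hypothetical input
    format). Entry [i][j] counts edges going from class i+1 to class j+1, so
    the diagonal holds exchanges within a class and the off-diagonal entries
    hold exchanges between classes.
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for parent_ec, child_ec in edges:
        if parent_ec == child_ec:
            continue  # no change in function along this edge
        matrix[ec_class(parent_ec) - 1][ec_class(child_ec) - 1] += 1
    return matrix


# Toy edges for illustration only; these are not results from the analysis.
toy_edges = [
    ("3.1.1.1", "3.1.1.2"),  # change within class 3
    ("3.1.1.1", "4.2.1.1"),  # class 3 -> class 4
    ("4.2.1.1", "5.3.1.1"),  # class 4 -> class 5
    ("5.3.1.1", "5.3.1.1"),  # function conserved, ignored
]
matrix = exchange_matrix(toy_edges)
for row in matrix:
    print(row)
```

Summing such matrices over all superfamilies would give one overall table of exchanges, in which the relative sizes of the diagonal and off-diagonal counts indicate how often function changes stay within a primary class.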
