Functional and informatics analysis enables glycosyltransferase activity prediction

The elucidation and prediction of how changes in a protein result in altered activities and selectivities remain a major challenge in chemistry. Two hurdles have prevented accurate family-wide models: obtaining (i) diverse datasets and (ii) suitable parameter frameworks that encapsulate activities in large sets. Here, we show that a relatively small but broad activity dataset is sufficient to train algorithms for functional prediction over the entire glycosyltransferase superfamily 1 (GT1) of the plant Arabidopsis thaliana. Whereas sequence analysis alone failed for GT1 substrate utilization patterns, our chemical–bioinformatic model, GT-Predict, succeeded by coupling physicochemical features with isozyme-recognition patterns over the family. GT-Predict identified GT1 biocatalysts for novel substrates and enabled functional annotation of uncharacterized GT1s. Finally, analyses of GT-Predict decision pathways revealed structural modulators of substrate recognition, thus providing information on mechanisms. This multifaceted approach to enzyme prediction may guide the streamlined utilization (and design) of biocatalysts and the discovery of other family-wide protein functions.

Bioinformatic analysis coupled to substrate-reactivity profiling for the glycosyltransferase (GT) enzyme superfamily supports the development of ‘GT-Predict’ as a tool for functional prediction of GT–substrate relationships.
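The abstract does not specify the algorithm behind GT-Predict, so the following is only a minimal sketch of the general idea of learning an interpretable decision rule from a small activity dataset. It fits a single-split decision rule (a decision stump) rather than a full model. The feature names (logP, hydroxyl count, activity with a related isozyme), the toy values, and the helper names `fit_stump` and `predict` are all hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Hypothetical toy data, invented for illustration only.
# Features: (logP, number of hydroxyl groups, 1 if a related isozyme is active)
# Label: 1 = glycosylation observed, 0 = not observed
Features = Tuple[float, float, float]
Sample = Tuple[Features, int]

TOY_DATA: List[Sample] = [
    ((1.2, 2, 1), 1),
    ((0.8, 3, 1), 1),
    ((3.5, 0, 0), 0),
    ((2.9, 1, 0), 0),
    ((1.5, 1, 1), 1),
    ((4.1, 0, 1), 0),
]


def fit_stump(data: List[Sample]) -> Tuple[int, float, int]:
    """Return (feature_index, threshold, label_if_at_or_below_threshold)
    for the single split with the fewest training errors."""
    best = None
    n_features = len(data[0][0])
    for f in range(n_features):
        values = sorted({x[f] for x, _ in data})
        # Candidate thresholds lie midway between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for below in (0, 1):
                errors = sum(
                    (below if x[f] <= t else 1 - below) != y for x, y in data
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, below)
    _, f, t, below = best
    return f, t, below


def predict(stump: Tuple[int, float, int], x: Features) -> int:
    """Apply the learned split to one feature vector."""
    f, t, below = stump
    return below if x[f] <= t else 1 - below


stump = fit_stump(TOY_DATA)
```

On this toy data the selected split falls on the first feature, which illustrates how inspecting a learned decision rule can point to the property that separates accepted from rejected substrates. A real model would use many more descriptors and splits, and would be assessed on held-out data.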
