Exploring genomes for glycosyltransferases.

Glycosyltransferases are one of the largest and most diverse groups of enzymes in nature. They catalyse the synthesis of glycosidic linkages by transferring a sugar residue from a donor to an acceptor substrate. These enzymes have been classified into families on the basis of amino acid sequence similarity, and the classification is kept up to date in the Carbohydrate-Active enZymes database (CAZy). The repertoire of glycosyltransferases in a genome is believed to determine the diversity of cellular glycan structures, and current estimates suggest that in most genomes about 1% of the coding regions encode glycosyltransferases. Plants, however, tend to have far more glycosyltransferase genes than any other organism sequenced to date. This can be explained by the highly complex polysaccharide network that forms the cell wall and by the numerous glycosylated secondary metabolites. In recent years, various bioinformatics strategies have been used to search bacterial and plant genomes for new glycosyltransferase genes. These strategies rely on remote homology detection methods that operate at the 1D, 2D, and 3D levels. The combined use of methods such as profile hidden Markov models (HMMs) and fold recognition appears to be appropriate for this class of enzymes. Chemometric tools are also particularly well suited to large and highly complex datasets, where they give an overview of multivariate data and reveal latent information.
