Sialoglycan microarray encoding reveals differential sialoglycan binding of phylogenetically-related bacterial AB5 toxin B subunits

Vertebrate sialic acids (Sias) display much diversity in modifications, linkages and underlying glycans. Slide microarrays allow high-throughput analysis of sialoglycan-protein interactions. The preceding paper used ~150 structurally defined sialyltrisaccharides, with various Sias and modified forms at their non-reducing ends, to compare pentameric sialoglycan-recognizing bacterial toxin B subunits. Unlike the poor correlation between B subunits and species phylogeny, there is a stronger correlation with the Sia types prominently expressed in susceptible species. Further supporting this pattern, we report a B subunit (YenB) from Yersinia enterocolitica (broad host range) that recognizes almost all sialoglycans in the microarray, including 4-O-acetylated Sias not recognized by a Y. pestis orthologue (YpeB). Differential Sia-binding patterns were also observed with phylogenetically related B subunits from Escherichia coli (SubB), Salmonella Typhi (PltB), S. Typhimurium (ArtB), extra-intestinal E. coli (EcPltB), Vibrio cholerae (CtxB), and a cholera-family homologue from E. coli (EcxB). Given the library size, data sorting and analysis posed a challenge. We devised a 9-digit code for trisaccharides consisting of a terminal Sia and the two underlying monosaccharides, assigned from the non-reducing end, with each group of three digits specifying a monosaccharide, its modifications, and its linkage. This code allows logical sorting and motif searching of results, and optimizes printing. While we developed the system for the >113,000 possible linear sialyltrisaccharides, we note that a biantennary N-glycan with two terminal sialoglycan trisaccharides could have >10^10 potential combinations, and a triantennary N-glycan with three terminal sequences >10^15 potential combinations. Although not all of these possibilities are likely to exist in nature, sialoglycans encode enormous diversity. Thus, while glycomic approaches address these challenges, naturally occurring toxin B subunits are simpler tools to track the dynamic sialome in biological systems.
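The coding idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the actual digit assignments, so the lookup tables (`MONOSACCHARIDE`, `MODIFICATION`, `LINKAGE`), the `Residue` type and the helper names are all hypothetical, chosen only to show how a three-digits-per-residue code supports sorting and motif searching. The last function checks the quoted combinatorial figures under the assumption that each antenna independently carries any of the ~113,000 linear sialyltrisaccharides.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Sequence

# HYPOTHETICAL digit tables: the real assignments are not given in the abstract.
MONOSACCHARIDE = {"Neu5Ac": 1, "Neu5Gc": 2, "Kdn": 3, "Gal": 4, "GlcNAc": 5, "Glc": 6}
MODIFICATION = {"none": 0, "9-O-acetyl": 1, "4-O-acetyl": 2}
LINKAGE = {"none": 0, "a2-3": 1, "a2-6": 2, "a2-8": 3, "b1-4": 4, "b1-3": 5}


class Residue(NamedTuple):
    sugar: str         # monosaccharide name
    modification: str  # modification on that monosaccharide
    linkage: str       # linkage to the next residue toward the reducing end


def encode(residues: Sequence[Residue]) -> str:
    """Build a 9-digit code, three digits per residue, from the non-reducing end."""
    if len(residues) != 3:
        raise ValueError("expected exactly three residues (a trisaccharide)")
    return "".join(
        f"{MONOSACCHARIDE[r.sugar]}{MODIFICATION[r.modification]}{LINKAGE[r.linkage]}"
        for r in residues
    )


def find_motif(codes: Sequence[str], pattern: str) -> list[str]:
    """Return codes matching a pattern in which '?' stands for any single digit."""
    return [c for c in codes if fnmatchcase(c, pattern)]


def antennary_combinations(arms: int, per_arm: int = 113_000) -> int:
    """Count combinations assuming each arm is chosen independently (an assumption)."""
    return per_arm ** arms


glycans = [
    [Residue("Neu5Gc", "none", "a2-6"), Residue("Gal", "none", "b1-4"), Residue("Glc", "none", "none")],
    [Residue("Neu5Ac", "4-O-acetyl", "a2-3"), Residue("Gal", "none", "b1-4"), Residue("GlcNAc", "none", "none")],
    [Residue("Neu5Ac", "none", "a2-3"), Residue("Gal", "none", "b1-4"), Residue("GlcNAc", "none", "none")],
]
codes = sorted(encode(g) for g in glycans)      # sorting groups by terminal Sia first
four_o_acetylated = find_motif(codes, "?2???????")  # second digit = Sia modification
```

Because every position in the code has a fixed meaning, a plain string sort groups structures by terminal Sia, then modification, then linkage, and a single-character wildcard pattern is enough to pull out a motif such as all 4-O-acetylated Sias under these hypothetical tables. Under the independence assumption, 113,000^2 is about 1.3 x 10^10 and 113,000^3 is about 1.4 x 10^15, consistent with the >10^10 and >10^15 figures quoted above.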
