Primary structure of Gal beta 1,3(4)GlcNAc alpha 2,3-sialyltransferase determined by mass spectrometry sequence analysis and molecular cloning. Evidence for a protein motif in the sialyltransferase gene family.

The Gal beta 1,3(4)GlcNAc alpha 2,3-sialyltransferase forms the NeuAc alpha 2,3Gal beta 1,3(4)GlcNAc sequences found in terminal carbohydrate groups of glycoproteins and glycolipids. High energy collision-induced dissociation analysis of tryptic peptides from only 300 pmol of the purified enzyme provided 25% of the total amino acid sequence and led to its successful cloning. The peptide sequence information was used to design short degenerate primers for the polymerase chain reaction. A long specific cDNA fragment was amplified and used to isolate a clone from a rat liver cDNA library. The cloned cDNA encodes a 374-amino acid protein containing an amino-terminal signal-anchor sequence characteristic of all cloned glycosyltransferases, and it produced sialyltransferase activity when transiently expressed in COS-1 cells. Comparison with two other cloned sialyltransferases revealed a homologous region of 55 amino acids located in the catalytic domains of all three enzymes. This feature, together with the lack of homology in the remaining 85% of the sequence of the three sialyltransferases, defines a pattern of sequence homology not found in cloned cDNAs of other glycosyltransferase families.
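
To illustrate the degenerate-primer step in general terms, the following minimal sketch back-translates a short peptide into an IUPAC-coded degenerate oligonucleotide using the standard genetic code and counts how many distinct sequences the primer pool would contain. The peptide MWHQ is a made-up example and not a sequence from this study, the function names are hypothetical, and the abstract does not describe the actual primer design procedure at this level of detail.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {frozenset(bases): sym for sym, bases in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}.items()}
SIZE = {sym: len(bases) for bases, sym in IUPAC.items()}


def degenerate_primer(peptide):
    """Back-translate a peptide into a sense-strand IUPAC degenerate sequence."""
    out = []
    for aa in peptide.upper():
        codons = CODONS[aa]
        for i in range(3):
            # Collapse the bases seen at codon position i into one IUPAC symbol.
            out.append(IUPAC[frozenset(c[i] for c in codons)])
    return "".join(out)


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by an IUPAC primer."""
    return prod(SIZE[sym] for sym in primer)


if __name__ == "__main__":
    primer = degenerate_primer("MWHQ")  # hypothetical peptide
    print(primer, degeneracy(primer))
```

Peptide stretches rich in Met and Trp (one codon each) and poor in Leu, Ser, and Arg (six codons each) keep the pool small, which is the usual reason for choosing such stretches when primers must be short. Note that collapsing six-codon residues position by position over-counts slightly: Leu becomes YTN, which covers 8 sequences rather than 6.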