A sequence‐coupled vector‐projection model for predicting the specificity of GalNAc‐transferase

The specificity of GalNAc‐transferase is consistent with the existence of an extended site composed of nine subsites, denoted by R4, R3, R2, R1, R0, R1′, R2′, R3′, and R4′, where the acceptor at R0 is the Ser or Thr to which the reducing monosaccharide is anchored. To predict whether a peptide will react with the enzyme to form a Ser‐ or Thr‐conjugated glycopeptide, a new method is proposed that combines the vector‐projection approach with the sequence‐coupled principle. Incorporating the sequence‐coupled effect reflects the interactions among the subsites during glycosylation, and using the vector‐projection approach avoids arbitrary assignments where experimental data are insufficient. The high ratio of correct predictions to total predictions for both the training and the testing sets indicates that the method is self‐consistent and efficient. It provides a rapid means of predicting O‐glycosylation and of designing effective inhibitors of GalNAc‐transferase, which might be useful for targeting drugs to specific sites in the body and for enzyme replacement therapy in the treatment of genetic disorders.
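
The nine‐subsite picture can be illustrated with a minimal Python sketch that lists the candidate peptide windows a predictor of this kind would have to score. It covers only the windowing step, not the vector‐projection or sequence‐coupled scoring itself, which the abstract does not specify. The function name candidate_windows, the SUBSITES list, and the padding residue "X" for positions that run past the ends of the chain are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Subsite labels as named in the abstract; R0 holds the Ser/Thr acceptor.
SUBSITES = ["R4", "R3", "R2", "R1", "R0", "R1'", "R2'", "R3'", "R4'"]
FLANK = 4  # residues on each side of R0, giving nine subsites in total


def candidate_windows(sequence: str, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, 9-residue window) for every Ser/Thr.

    Assumption: positions that fall outside the sequence are filled
    with the placeholder residue `pad` (hypothetical convention).
    """
    seq = sequence.upper()
    padded = pad * FLANK + seq + pad * FLANK
    windows = []
    for i, residue in enumerate(seq):
        if residue in ("S", "T"):
            # seq[i] sits at padded[i + FLANK], so this slice is centred on it.
            windows.append((i + 1, padded[i : i + 2 * FLANK + 1]))
    return windows


if __name__ == "__main__":
    for position, window in candidate_windows("AAPTSGA"):
        # Pair each residue with the subsite it would occupy.
        print(position, dict(zip(SUBSITES, window)))
```

Each window returned this way has its Ser or Thr at R0, so a scoring function trained on known substrates could be applied to it position by position.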
