Prediction of O-linked glycosylation sites in protein by independent component analysis

Glycosylation is one of the most important post-translational modification steps in eukaryotic cells. In this paper, we propose a new approach based on independent component analysis (ICA) for predicting O-linked glycosylation sites and for pattern analysis. Principal component analysis (PCA) is first used to find the significant uncorrelated components, and ICA is then used to extract independent components, which form the basis (main basis) of a subspace of protein sequences. The prediction is treated as a two-class classification problem. The test protein vector is projected onto each class subspace, and the protein sequence is assigned to the nearest class, where nearness is measured by the distance between the test vector and its projection onto the subspace. The prediction accuracy of the proposed approach is higher than that of other subspace methods based on PCA.
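The nearest-subspace decision rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-class mean used for centering, the class labels, and the toy three-dimensional data are all assumptions made for illustration. The basis vectors are taken as given (in the paper they would come from PCA followed by ICA); because ICA basis vectors are not necessarily orthogonal, the sketch orthonormalizes them with Gram-Schmidt before projecting.

```python
import math

def orthonormalize(basis, tol=1e-12):
    """Modified Gram-Schmidt: orthonormalize possibly non-orthogonal basis vectors."""
    ortho = []
    for v in basis:
        w = list(v)
        for q in ortho:
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that are linearly dependent on earlier ones
            ortho.append([wi / norm for wi in w])
    return ortho

def residual_distance(x, mean, basis):
    """Distance between x and its projection onto the subspace spanned by basis.

    Centering by a per-class mean is an assumption of this sketch.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    residual = list(centered)
    for q in orthonormalize(basis):
        coeff = sum(ci * qi for ci, qi in zip(centered, q))
        residual = [ri - coeff * qi for ri, qi in zip(residual, q)]
    return math.sqrt(sum(ri * ri for ri in residual))

def classify(x, subspaces):
    """Return the label of the class subspace closest to x."""
    return min(subspaces, key=lambda label: residual_distance(x, *subspaces[label]))

# Toy data (hypothetical): each class maps to (mean, basis vectors).
subspaces = {
    "glycosylated": ([0.0, 0.0, 0.0], [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]),
    "non_glycosylated": ([0.0, 0.0, 0.0], [[0.0, 0.0, 1.0]]),
}

label = classify([2.0, 3.0, 0.1], subspaces)
```

In this toy example the first class spans the x-y plane and the second spans the z axis, so a vector lying almost entirely in the x-y plane has a small residual for the first class and is assigned to it. In a real setting each test vector would be an encoded sequence window around a candidate site; that encoding is not specified in the abstract.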
