Predicting secretory protein signal sequence cleavage sites by fusing the marks of global alignments

Summary. A newly synthesized secretory protein bears a special sequence, called a signal peptide or signal sequence, which acts as an “address tag” that guides the protein to wherever it is needed. This unique function of signal sequences has stimulated novel strategies for drug design or for reprogramming cells for gene therapy. To realize these ideas, however, it is important to develop an automated method for identifying signal sequences and their cleavage sites quickly and accurately. In this paper, a new method is developed for predicting the signal sequence of a query secretory protein by fusing the results from a series of global alignments through a voting system. The very high success rates thus obtained suggest that the approach is promising, and that the method may become a useful vehicle for identifying signal sequences, or at least serve as a complementary tool to the existing algorithms in this field.
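The general idea of fusing global alignments through a vote can be illustrated with a minimal sketch. It is not the procedure of this paper, whose details the summary does not give. It assumes a hypothetical set of known signal peptides (`known_signals`), a simple match/mismatch/linear-gap scoring scheme, and a rule in which each known peptide votes for the query prefix length it aligns with best under a Needleman-Wunsch global alignment. The function names and parameters are illustrative only.

```python
from collections import Counter


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, linear gap penalty).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def predict_cleavage(query, known_signals, min_len=3, max_len=None):
    """Hypothetical voting rule: each known signal peptide votes for the
    query prefix length (candidate cleavage position) it aligns with best.

    Returns the winning prefix length and the full vote tally.
    """
    max_len = min(max_len or len(query), len(query))
    votes = Counter()
    for sig in known_signals:
        best_k = max(
            range(min_len, max_len + 1),
            key=lambda k: nw_score(query[:k], sig),
        )
        votes[best_k] += 1
    return votes.most_common(1)[0][0], votes


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    query = "MKALLVAGASQ"
    known_signals = ["MKALLVA", "MKALLVA", "MKVLLVA"]
    site, votes = predict_cleavage(query, known_signals)
    print(site, dict(votes))
```

In this toy setup the predicted cleavage site is the position immediately after the returned prefix length. A realistic implementation would presumably use a substitution matrix and a curated training set rather than the identity scoring and invented sequences used here.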
