Prediction of protein signal sequences and their cleavage sites

Protein signal sequences play a central role in the targeting and translocation of nearly all secreted proteins and many integral membrane proteins in both prokaryotes and eukaryotes. Knowledge of signal sequences has become a crucial tool for pharmaceutical scientists who genetically modify bacteria, plants, and animals to produce effective drugs. To use this tool effectively, however, a fast and reliable method is first needed to identify the “zipcode” entity, that is, the signal sequence; the need is heightened by both the huge amount of unprocessed sequence data available and the industrial demand for more effective vehicles for producing proteins in recombinant systems. In view of this, a sequence‐encoded algorithm was developed to identify signal sequences and predict their cleavage sites. For 1,939 secretory proteins and 1,440 nonsecretory proteins, the rate of correct prediction is 90.14% by the self‐consistency test and 90.13% by the jackknife test. These encouraging results indicate that signal sequences share some common features despite lacking similarity in sequence, length, and even composition, and that they can be predicted with considerable accuracy. Proteins 2001;42:136–139. © 2000 Wiley‐Liss, Inc.
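
The two evaluation protocols named in the abstract can be illustrated with a minimal sketch. In a self‐consistency test the predictor is trained and scored on the same proteins; in a jackknife (leave‐one‐out) test each protein is predicted by a model trained on all the others. Everything below is an assumption made for illustration: the hydrophobic‐fraction feature, the nearest‐mean classifier, and the four toy sequences are hypothetical stand‐ins and are not the sequence‐encoded algorithm or the dataset of the paper.

```python
from statistics import mean

# Assumed toy feature: residues treated as hydrophobic (illustrative choice only).
HYDROPHOBIC = set("AILMFVW")

def hydrophobic_fraction(seq, window=20):
    """Fraction of hydrophobic residues in the N-terminal window."""
    head = seq[:window]
    return sum(r in HYDROPHOBIC for r in head) / len(head)

def fit(samples):
    """Nearest-mean model: map each label to the mean feature value of its class."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(model, x):
    """Return the label whose class mean is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))

def self_consistency(samples):
    """Train on all samples, then score on the same samples."""
    model = fit(samples)
    return sum(predict(model, x) == y for x, y in samples) / len(samples)

def jackknife(samples):
    """Leave-one-out: each sample is predicted by a model trained on the rest."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        model = fit(samples[:i] + samples[i + 1:])
        hits += predict(model, x) == y
    return hits / len(samples)

# Invented sequences, labeled 1 (secretory) or 0 (nonsecretory); not real data.
toy_sequences = [
    ("MKLLVLALAVLAAFA", 1),
    ("MAIVLLLFAAVAGSA", 1),
    ("MSDKEERTGDNSQEK", 0),
    ("MTEEQKRDGSNPTDE", 0),
]
toy = [(hydrophobic_fraction(s), y) for s, y in toy_sequences]

print("self-consistency rate:", self_consistency(toy))
print("jackknife rate:", jackknife(toy))
```

Because the jackknife never scores a protein with a model that has seen it, it is generally the stricter of the two tests; near‐identical rates under both, as reported above, suggest the predictor is not simply memorizing its training set.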
