Prediction of membrane protein types and subcellular locations

Membrane proteins are classified according to two different schemes. In scheme 1, they are assigned to one of five types: (1) type I single‐pass transmembrane, (2) type II single‐pass transmembrane, (3) multipass transmembrane, (4) lipid chain‐anchored membrane, and (5) GPI‐anchored membrane proteins. In scheme 2, they are assigned to one of nine subcellular locations: (1) chloroplast, (2) endoplasmic reticulum, (3) Golgi apparatus, (4) lysosome, (5) mitochondria, (6) nucleus, (7) peroxisome, (8) plasma membrane, and (9) vacuole. An algorithm is formulated for predicting the type or location of a given membrane protein from its amino acid composition. The overall rates of correct prediction obtained by self‐consistency and jackknife tests, as well as by an independent dataset test, were about 76–81% for the classification of five types and 66–70% for the classification of nine subcellular locations. Classification and prediction were also conducted between inner and outer membrane proteins; the corresponding rates were 88–91%. These results imply that the types of membrane proteins, as well as their subcellular locations and other attributes, are closely correlated with their amino acid composition. It is anticipated that the classification schemes and prediction algorithm can expedite the determination of function for new proteins. The concept and method can also be useful in prioritizing genes and proteins identified by genomics efforts as potential molecular targets for drug design. Proteins 1999;34:137–153. © 1999 Wiley‐Liss, Inc.
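To make the idea of composition-based prediction concrete, the following is a minimal Python sketch, not the authors' algorithm. It represents each sequence as a 20-component amino acid composition vector, assigns a query to the class whose mean composition is nearest, and estimates accuracy with a jackknife (leave-one-out) test. The abstract does not state which discriminant function is used, so plain Euclidean distance to the class centroid is an assumption made here for illustration. The function names (`composition`, `predict`, `jackknife_accuracy`), the class labels, and the toy sequences are all hypothetical and are not real membrane protein data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return the 20-component composition vector (fractions summing to 1)."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def predict(sequence, training):
    """Assign `sequence` to the class with the nearest mean composition.

    `training` maps a class label to a list of sequences.
    Euclidean distance is an illustrative stand-in (assumption).
    """
    query = composition(sequence)
    best_label, best_distance = None, math.inf
    for label, sequences in training.items():
        center = centroid([composition(s) for s in sequences])
        distance = math.dist(query, center)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


def jackknife_accuracy(training):
    """Leave-one-out test: hold out each sequence, retrain, and predict it."""
    correct = total = 0
    for label, sequences in training.items():
        for i, held_out in enumerate(sequences):
            reduced = dict(training)
            reduced[label] = sequences[:i] + sequences[i + 1:]
            if not reduced[label]:
                continue  # a class cannot be left with no examples
            total += 1
            correct += predict(held_out, reduced) == label
    return correct / total


# Made-up toy sequences, for illustration only.
toy_training = {
    "class_A": ["LLLLVVVAAI", "LLVVIIAALL"],
    "class_B": ["KKDDEERRKK", "DDEEKKRRDE"],
}

if __name__ == "__main__":
    print(predict("LLVVAAII", toy_training))
    print(jackknife_accuracy(toy_training))
```

In a self-consistency test, by contrast, each training sequence would be predicted with class centroids that still include it, which is why jackknife rates are generally the more conservative of the two estimates.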
