pTARGET: a new method for predicting protein subcellular localization in eukaryotes

Motivation: Efficient computational methods for predicting protein subcellular localization in eukaryotes are scarce, and currently available methods have several limitations that make them inadequate for genome-scale predictions. Here, we present a new prediction method, pTARGET, that can predict proteins targeted to nine different subcellular locations in eukaryotic animal species.

Results: The nine subcellular locations predicted by pTARGET are cytoplasm, endoplasmic reticulum, extracellular/secretory, Golgi, lysosomes, mitochondria, nucleus, plasma membrane and peroxisomes. Predictions are based on location-specific protein functional domains and on amino acid compositional differences across subcellular locations. Overall, this method can predict 68–87% of the true positives at accuracy rates of 96–99%. In a comparison against PSORT, pTARGET prediction rates were higher by 11–60% in 6 of the 8 locations tested. In addition, pTARGET is robust enough for genome-scale prediction of protein subcellular localization because it does not rely on the presence of signal or target peptides.

Availability: A public web server based on the pTARGET method is accessible at http://bioinformatics.albany.edu/~ptarget. Datasets used for developing pTARGET can be downloaded from this web server. Source code is available on request from the corresponding author.

Contact: [email protected]

Supplementary data: Available online only from the publisher.
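The amino acid composition feature mentioned in the Results can be illustrated with a minimal sketch. This is not pTARGET's scoring scheme, which the abstract does not describe. It only shows how a 20-component composition vector might be computed from a sequence. The function name `amino_acid_composition` and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard characters (e.g. X, gaps) are ignored. This is an
    illustrative helper, not part of pTARGET.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy sequence, used only to exercise the function.
example = "MKKLLA"
comp = amino_acid_composition(example)
```

Composition vectors computed this way for proteins of known location could then be compared across locations. How pTARGET combines them with domain information is not specified here.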
