LumenP—A neural network predictor for protein localization in the thylakoid lumen

We report the development of LumenP, a new neural network-based predictor that identifies proteins targeted to the thylakoid lumen of plant chloroplasts and predicts their cleavage sites. When used together with the previously developed TargetP predictor, LumenP performs significantly better than previously reported attempts at predicting thylakoid lumen location, mostly because of a lower false positive rate. The combination of TargetP and LumenP predicts that around 1.5%–3% of all proteins encoded in the genomes of Arabidopsis thaliana and Oryza sativa are located in the thylakoid lumen.
