In silico prediction of the peroxisomal proteome in fungi, plants and animals.

To improve the prediction of peroxisomal proteins, we combined machine-learning techniques for analyzing the peroxisomal targeting signal type 1 (PTS1) with domain-based cross-species comparisons among eight eukaryotic genomes. Our results indicate that this combined approach has significantly higher specificity than earlier attempts to predict peroxisomal localization, without a loss in sensitivity. It allowed us to predict 430 peroxisomal proteins, almost all of which lack a localization annotation. These proteins can be grouped into 29 families covering most of the known steps in all known peroxisomal pathways. In general, plants have the highest number of predicted peroxisomal proteins and fungi the lowest.
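
The idea of combining a PTS1 signal check with a cross-species domain filter can be illustrated with a minimal sketch. It is not the method of the paper: the trained machine-learning classifier is replaced here by the textbook C-terminal consensus tripeptide [SAC][KRH][LM], and the cross-species rule (a domain must carry a PTS1-like signal in at least `min_species` species) is an assumption made only for illustration. The record keys `id`, `species`, `domain` and `sequence`, and all function names, are hypothetical.

```python
import re
from collections import defaultdict

# Textbook PTS1 consensus at the C-terminus (e.g. -SKL). This regular
# expression is a stand-in for a trained classifier, not the paper's model.
PTS1_PATTERN = re.compile(r"[SAC][KRH][LM]$")


def has_pts1(sequence):
    """Return True if the protein sequence ends in a PTS1-like tripeptide."""
    return bool(PTS1_PATTERN.search(sequence.upper()))


def predict_peroxisomal(proteins, min_species=2):
    """Keep proteins whose PTS1-like signal is supported across species.

    proteins: list of dicts with the hypothetical keys
              'id', 'species', 'domain' and 'sequence'.
    min_species: assumed threshold; a domain must show a PTS1-like
                 signal in at least this many distinct species.
    """
    # Step 1: for each domain, collect the species in which a protein
    # with that domain carries a PTS1-like C-terminus.
    species_by_domain = defaultdict(set)
    for p in proteins:
        if has_pts1(p["sequence"]):
            species_by_domain[p["domain"]].add(p["species"])

    # Step 2: accept a protein only if it has the signal itself and its
    # domain is supported in enough species.
    return [
        p["id"]
        for p in proteins
        if has_pts1(p["sequence"])
        and len(species_by_domain[p["domain"]]) >= min_species
    ]
```

In this sketch the domain filter is what raises specificity: a protein that ends in a PTS1-like tripeptide by chance is discarded unless proteins sharing its domain carry the signal in other species as well, while proteins with a conserved signal are retained.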
