Supervised Ensembles of Prediction Methods for Subcellular Localization

In the past decade, many automated methods for predicting the subcellular localization of proteins have been proposed, drawing on a wide range of principles and learning approaches. Based on an experimental evaluation of different methods and an analysis of their theoretical properties, we propose combining a well-balanced set of existing approaches into new, ensemble-based prediction methods. The experimental evaluation shows that our ensembles improve substantially over the underlying base methods.
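To make the idea of combining base predictors concrete, the following is a minimal sketch of one simple supervised combination scheme: accuracy-weighted voting, where each base method's weight is estimated from labeled training proteins. This is an illustrative assumption and not necessarily the combination scheme used in the paper; the method names (`method_a`, `method_b`, `method_c`), the labels, and the functions `fit_weights` and `ensemble_predict` are hypothetical.

```python
from collections import defaultdict


def fit_weights(base_predictions, true_labels):
    """Estimate one weight per base method: its accuracy on labeled training proteins.

    base_predictions maps a (hypothetical) method name to its list of predicted
    localizations, aligned with true_labels. Assumes true_labels is non-empty.
    """
    weights = {}
    for method, preds in base_predictions.items():
        correct = sum(p == t for p, t in zip(preds, true_labels))
        weights[method] = correct / len(true_labels)
    return weights


def ensemble_predict(weights, votes):
    """Combine the base predictions for one protein by accuracy-weighted voting."""
    scores = defaultdict(float)
    for method, label in votes.items():
        scores[label] += weights.get(method, 0.0)
    # Iterate labels in sorted order so ties are broken alphabetically.
    return max(sorted(scores), key=scores.get)


# Toy training data (invented for illustration only).
true_labels = ["nucleus", "cytoplasm", "mitochondrion", "nucleus"]
base_predictions = {
    "method_a": ["nucleus", "cytoplasm", "nucleus", "nucleus"],
    "method_b": ["cytoplasm", "cytoplasm", "mitochondrion", "cytoplasm"],
    "method_c": ["nucleus", "nucleus", "mitochondrion", "cytoplasm"],
}
weights = fit_weights(base_predictions, true_labels)

# Two weaker methods that agree can outvote one stronger method.
prediction = ensemble_predict(
    weights,
    {"method_a": "nucleus", "method_b": "mitochondrion", "method_c": "mitochondrion"},
)
```

In this sketch, the supervision enters only through the per-method weights. A trained meta-classifier over the base outputs (stacking) would be a natural, more expressive alternative under the same assumptions.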
