LocTree3 prediction of localization

Predicting the sub-cellular localization of a protein is an important step toward elucidating its function. For each query protein sequence, LocTree2 applies machine learning (a profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, six for bacteria and three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than other state-of-the-art methods. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves on it by adding homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy of Q18 = 80 ± 3% for eukaryotes and a six-state accuracy of Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. On an unloaded server, the response time is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome, not counting the time needed to generate the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.
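The combination of homology-based inference with a machine-learning predictor, and the Q-state accuracy used to assess it, can be illustrated with a minimal sketch. This is not the LocTree3 implementation: the `Hit` class, the `predict_localization` and `q_accuracy` functions, the E-value cutoff of 1e-3 and the rule "use the best annotated hit if one exists, otherwise fall back to machine learning" are all assumptions made for illustration. Q-state accuracy is taken here as the percentage of proteins whose predicted class matches the observed one.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Hit:
    """Hypothetical alignment hit against a protein with known localization."""
    localization: str
    e_value: float


def predict_localization(
    sequence: str,
    hits: Sequence[Hit],
    ml_predict: Callable[[str], str],
    max_e_value: float = 1e-3,  # assumed cutoff, not taken from the paper
) -> Tuple[str, str]:
    """Return (localization, source) for one query protein.

    Assumed rule: transfer the annotation of the most significant hit
    if any hit passes the cutoff; otherwise use the ML predictor.
    """
    usable = [h for h in hits if h.e_value <= max_e_value]
    if usable:
        best = min(usable, key=lambda h: h.e_value)
        return best.localization, "homology"
    return ml_predict(sequence), "machine learning"


def q_accuracy(observed: Sequence[str], predicted: Sequence[str]) -> float:
    """Percentage of proteins predicted in the observed class."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Stand-in for a trained classifier; always answers "secreted".
    dummy_ml = lambda seq: "secreted"
    hits = [Hit("nucleus", 1e-20), Hit("cytoplasm", 1e-5)]
    print(predict_localization("MKT", hits, dummy_ml))
    print(predict_localization("MKT", [], dummy_ml))
    print(q_accuracy(["a", "b", "c", "d"], ["a", "b", "c", "x"]))
```

Returning the source of each prediction alongside the class makes it possible to report how many proteins were covered by homology and how many needed the machine-learning fallback.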

Kieu Trinh Do | Katharina M. Hembach | Burkhard Rost | Tobias Hamp | Maximilian Hecht | Henrik Nielsen | Philipp Angerer | Michael Bernhofer | Jonas Reeb | Tatyana Goldberg | Maria Kalemanov | Jonas Zierer | Guy Yachdav | Timothy Karl | Nadeem Ahmed | Max Herzog | Maximilian Hastreiter | Michael Kluge | Sonja Ansorge | Robert Greil | Alice Meier | Ilira Troshani | Susann Vorberg | Uwe Altermann | Kinga Balasz | Alexander Betz | Laura Cizmadija | Julia Gerke | Vadim Joerdens | Hassan Nasir | Ulrich Neumaier | Verena Prade | Aleksandr Sorokoumov | Sonja Waldraff
