YLoc—an interpretable web server for predicting subcellular localization

Computational prediction of protein subcellular localization has become a valuable alternative to time-consuming experimental methods. Two major drawbacks of many existing predictors are their lack of interpretability and their failure to provide a confidence estimate for individual predictions. We present YLoc, an interpretable web server for predicting subcellular localization. YLoc explains in natural language why a prediction was made and which biological property of the protein was mainly responsible for it. In addition, YLoc estimates the reliability of its own predictions. YLoc can thus assist in understanding protein localization and in location engineering of proteins. The YLoc web server is available online at www.multiloc.org/YLoc.
