SCLpred-EMS: subcellular localization prediction of endomembrane system and secretory pathway proteins by Deep N-to-1 Convolutional Neural Networks

MOTIVATION The subcellular location of a protein can provide useful information for protein function prediction and drug design. Determining the subcellular location of a protein experimentally is expensive and time-consuming. Various computational tools, mostly based on machine learning algorithms, have therefore been developed to predict the subcellular location of proteins.

RESULTS Here, we present a neural-network-based algorithm for protein subcellular location prediction. We introduce SCLpred-EMS, a subcellular localization predictor powered by an ensemble of Deep N-to-1 Convolutional Neural Networks. SCLpred-EMS classifies proteins into two classes, the endomembrane system and secretory pathway versus all others, with an MCC of 0.75-0.86, outperforming the other state-of-the-art web servers we tested.

AVAILABILITY SCLpred-EMS is freely available for academic users at http://distilldeep.ucd.ie/SCLpred2/.

SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
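The results above are reported as MCC (Matthews correlation coefficient), a standard measure for two-class predictors that ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction). As a minimal sketch of how such a score is computed from a binary confusion matrix, the following Python function may help. The function name `mcc` and the example counts are illustrative assumptions, not values or code from SCLpred-EMS.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a two-class confusion matrix.

    tp/fn: positive-class proteins (here, endomembrane system and secretory
           pathway) predicted correctly / incorrectly.
    tn/fp: proteins of all other classes predicted correctly / incorrectly.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts chosen only to illustrate the calculation.
score = mcc(tp=90, tn=85, fp=15, fn=10)
print(f"MCC = {score:.3f}")
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the two classes differ greatly in size, as is typical for a "one compartment versus all others" split.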
