Computational identification of vesicular transport proteins from sequences using deep gated recurrent units architecture
