ET-GRU: using multi-layer gated recurrent units to identify electron transport proteins

Background
The electron transport chain is a series of protein complexes embedded in the process of cellular respiration, an important process for transferring electrons and macromolecules throughout the cell. It is also the major process for extracting energy via redox reactions during the oxidation of sugars. Many studies have implicated electron transport proteins in a variety of human diseases, e.g., diabetes, Parkinson's disease, and Alzheimer's disease. A few bioinformatics studies have attempted to identify electron transport proteins; however, their performance leaves considerable room for improvement. Here, we present a novel deep neural network architecture to address this problem.

Results
Most previous studies could not feed the original position-specific scoring matrix (PSSM) profiles into neural networks; the resulting loss of information kept the networks from achieving their best results. In this paper, we resolve this problem by applying deep gated recurrent units (GRU) to full PSSMs. Our approach predicts electron transporters with cross-validation and independent test accuracies of 93.5% and 92.3%, respectively, and outperforms all of the state-of-the-art predictors of electron transport proteins.

Conclusions
Through this study, we provide ET-GRU, a web server for discriminating electron transport proteins in particular and other protein functions in general. Our results could also promote the use of GRU in computational biology, especially in protein function prediction.
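To illustrate why a recurrent model can consume a full PSSM rather than a condensed one, the following is a minimal NumPy sketch of a stacked GRU forward pass. It is not the ET-GRU implementation: the class `GRULayer`, the function `predict_proba`, the two-layer depth, the hidden size of 32, and the random (untrained) weights are all assumptions made for illustration. The point it shows is that a PSSM of any length L (one row of 20 scores per residue) is read row by row and reduced to a fixed-size hidden state, which a final logistic unit maps to a score.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRULayer:
    """One GRU layer with random, untrained weights (hypothetical sketch)."""

    def __init__(self, input_size, hidden_size, rng):
        scale = 1.0 / np.sqrt(hidden_size)
        # Index 0: update gate z, index 1: reset gate r, index 2: candidate n.
        self.W = rng.uniform(-scale, scale, (3, hidden_size, input_size))
        self.U = rng.uniform(-scale, scale, (3, hidden_size, hidden_size))
        self.b = np.zeros((3, hidden_size))
        self.hidden_size = hidden_size

    def forward(self, xs):
        """Map a sequence of shape (L, input_size) to hidden states (L, hidden_size)."""
        h = np.zeros(self.hidden_size)
        states = []
        for x in xs:
            z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
            r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
            n = np.tanh(self.W[2] @ x + self.U[2] @ (r * h) + self.b[2])
            # Blend the previous state with the candidate state.
            h = (1.0 - z) * n + z * h
            states.append(h)
        return np.stack(states)


def predict_proba(pssm, layers, w_out, b_out):
    """Run the PSSM through the stacked layers; score the last hidden state."""
    seq = pssm
    for layer in layers:
        seq = layer.forward(seq)
    return float(sigmoid(w_out @ seq[-1] + b_out))


rng = np.random.default_rng(0)
# Assumed sizes: 20 PSSM columns in, two stacked layers of 32 hidden units.
layers = [GRULayer(20, 32, rng), GRULayer(32, 32, rng)]
w_out = rng.uniform(-0.1, 0.1, 32)
b_out = 0.0

# Synthetic stand-ins for PSSM profiles of two proteins with different lengths.
short_pssm = rng.normal(size=(50, 20))
long_pssm = rng.normal(size=(300, 20))

p_short = predict_proba(short_pssm, layers, w_out, b_out)
p_long = predict_proba(long_pssm, layers, w_out, b_out)
print(p_short, p_long)
```

Because the weights are random, the two printed scores carry no biological meaning; the sketch only shows that proteins of length 50 and 300 pass through the same network without truncating or averaging the PSSM. A usable predictor would need training on labeled sequences.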
