DeepRibSt: a multi-feature convolutional neural network for predicting ribosome stalling

Ribosomes are organelles that carry out translation, the synthesis of proteins from mRNA, but the mechanisms underlying ribosome stalling are not fully understood. Advances in experimental techniques have produced a large amount of ribosome footprinting data, which can be used to study ribosome stalling. Precisely locating ribosome stalling sites would aid the treatment of related diseases; however, there is still considerable room for improvement in ribosome stalling prediction. In this study, we propose a new deep neural network model, DeepRibSt, for predicting ribosome stalling sites. We first process the ribosome footprinting data into a training set. Then three new features (evolutionary conservation, hydrophobicity, and amino dissociation constant) are extracted, together with previously used sequence features, as input to the network. To improve prediction performance, we design a new network architecture with two convolutional layers and three fully connected layers. To validate DeepRibSt, we compare it with four popular deep neural networks (AlexNet, LeNet, ResNet, and LSTM) on human (Battle2015 and Stumpf13) and yeast (Pop2014, Young15, and Brar12) data. To further demonstrate the effectiveness of DeepRibSt, we also compare it with the state-of-the-art method. The experimental results show that DeepRibSt outperforms all other methods in accuracy, recall, specificity, F1-score, and the area under the receiver operating characteristic curve (AUC).
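For readers unfamiliar with the five evaluation measures named above, the following is a minimal sketch of how they are conventionally computed for a binary classifier (stalling site = 1, non-stalling site = 0). It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
# Illustrative only: standard definitions of the metrics listed in the abstract.
# Function names, the 0.5 threshold, and the toy data are hypothetical.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, recall (sensitivity), specificity, and F1-score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "specificity": specificity, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half). Quadratic time; fine for small inputs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made-up labels and predicted stalling probabilities).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, recall, specificity, and F1-score depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes performance across all thresholds.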
