DeepM6ASeq: prediction and characterization of m6A-containing sequences using deep learning

Background: N6-methyladenosine (m6A) is a common and abundant RNA methylation modification found in various species. As a type of post-transcriptional methylation, m6A plays an important role in diverse RNA activities such as alternative splicing, interplay with microRNAs, and translation efficiency. Although existing tools can predict m6A at single-base resolution, it remains challenging to extract the biological information surrounding m6A sites.

Results: We implemented a deep learning framework, named DeepM6ASeq, to predict m6A-containing sequences and characterize surrounding biological features based on miCLIP-Seq data, which detects m6A sites at single-base resolution. DeepM6ASeq performed better than other machine learning classifiers. Moreover, an independent test on m6A-Seq data, which identifies m6A-containing genomic regions, showed that our model is competitive in predicting m6A-containing sequences. The motifs learned by DeepM6ASeq correspond to known m6A readers. Notably, DeepM6ASeq also identifies a newly recognized m6A reader, FMR1. In addition, we found that a saliency map in the deep learning model can be used to visualize the locations of m6A sites.

Conclusion: We developed a deep-learning-based framework to predict and characterize m6A-containing sequences, which we hope will help investigators gain more insight into m6A research. The source code is available at https://github.com/rreybeyb/DeepM6ASeq.
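To make the saliency-map idea concrete, the following is a minimal sketch and not the DeepM6ASeq implementation. It assumes a toy model with a single hand-set convolution filter, global max pooling, and a sigmoid output. The filter rewards the 5-mer GGACU, chosen purely for illustration as one instance of the widely reported DRACH consensus. The names `one_hot`, `forward`, and `saliency`, the filter weights, the bias, and the example sequence are all hypothetical. Saliency is computed as the absolute gradient of the predicted probability with respect to the one-hot input, restricted to the observed base at each position.

```python
import numpy as np

ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) one-hot matrix; T is read as U."""
    seq = seq.upper().replace("T", "U")
    x = np.zeros((len(seq), len(ALPHABET)))
    for i, base in enumerate(seq):
        if base in ALPHABET:  # unknown bases (e.g. N) stay all-zero
            x[i, ALPHABET.index(base)] = 1.0
    return x


def forward(x, kernel, bias):
    """Toy model: one convolution filter, global max pooling, sigmoid."""
    k = kernel.shape[0]
    scores = np.array(
        [np.sum(x[i:i + k] * kernel) for i in range(len(x) - k + 1)]
    )
    best = int(np.argmax(scores))  # window selected by max pooling
    prob = 1.0 / (1.0 + np.exp(-(scores[best] + bias)))
    return prob, best


def saliency(x, kernel, bias):
    """Per-position saliency: |d prob / d x| at the observed base."""
    prob, best = forward(x, kernel, bias)
    k = kernel.shape[0]
    grad = np.zeros_like(x)
    # Max pooling routes the gradient only through the best-scoring window;
    # the sigmoid contributes the factor prob * (1 - prob).
    grad[best:best + k] = prob * (1.0 - prob) * kernel
    return np.abs(grad * x).sum(axis=1)


# Hypothetical filter: +1 for each base of GGACU, -1 for any mismatch.
kernel = 2.0 * one_hot("GGACU") - 1.0
bias = -3.0  # assumed value, chosen so that only a near-perfect match scores high

x = one_hot("AUCGGACUAUC")
prob, best = forward(x, kernel, bias)
sal = saliency(x, kernel, bias)
for base, s in zip("AUCGGACUAUC", sal):
    print(base, round(float(s), 3))
```

In this toy setting, the nonzero saliency values fall only on the five positions covered by the motif, which is the sense in which a saliency map can point to the neighbourhood of a candidate m6A site. A trained network would use many learned filters and automatic differentiation in place of the hand-derived gradient.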
