iIM-CNN: Intelligent Identifier of 6mA Sites on Different Species by Using Convolution Neural Network

DNA N6-methyladenine (6mA) is involved in a wide range of biological processes, including DNA transcription, replication, and repair. Precise identification of 6mA sites is therefore vital for understanding its biological functions. Although biochemical experiments have produced positive results, they are costly and time-consuming, so a robust computational model is needed to facilitate the identification of 6mA sites. In this regard, we develop a deep learning-based computational model, named iIM-CNN, for the identification of N6-methyladenine sites from DNA sequences. iIM-CNN extracts important features using a convolutional neural network (CNN). The proposed model achieves a Matthews correlation coefficient (MCC) of 0.651, 0.752, and 0.941 for the cross-species, rice, and M. musculus genomes, respectively. A comparison of these outcomes shows that the proposed model outperforms existing computational tools for the prediction of 6mA sites. Finally, a user-friendly web server is publicly available at https://home.jbnu.ac.kr/NSCL/iIMCNN.htm
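For readers unfamiliar with the reported metric, the following is a minimal sketch of how the Matthews correlation coefficient is commonly computed from binary confusion-matrix counts. The function name and the example counts are illustrative assumptions; they are not taken from the paper and do not reproduce its evaluation pipeline.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn: correctly predicted 6mA / non-6mA sites
    fp/fn: wrongly predicted 6mA / non-6mA sites
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero,
    # since the coefficient is undefined in that case.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not results from the paper).
example_mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
```

The MCC ranges from -1 to 1, with 1 indicating perfect agreement between predictions and labels, which makes it a useful single-number summary when the positive and negative classes are unevenly sized.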
