DeepMP: a deep learning tool to detect DNA base modifications on Nanopore sequencing data

Motivation: DNA methylation plays a key role in a variety of biological processes. Recently, Nanopore long-read sequencing has enabled direct detection of these modifications. As a consequence, a range of computational methods have been developed to exploit Nanopore data for methylation detection. However, current approaches rely on a human-defined threshold to call the methylation status of a genomic position and are not optimized to detect sites methylated at low frequency. Furthermore, most methods use either the Nanopore signals or the basecalling errors as model input and do not take advantage of their combination.

Results: Here we present DeepMP, a convolutional neural network (CNN)-based model that combines information from Nanopore signals and basecalling errors to detect whether a given motif in a read is methylated. In addition, DeepMP introduces a threshold-free position modification calling model that is sensitive to sites methylated at low frequency across cells. We comprehensively benchmarked DeepMP against state-of-the-art methods on E. coli, human and pUC19 datasets. DeepMP outperforms current approaches at read-based and position-based methylation detection across sites methylated at different frequencies in the three datasets.

Availability: DeepMP is implemented and freely available under the MIT license at github.com/pepebonet/DeepMP

Contact: jose.bonet@irbbarcelona.org, mandiche@kth.se
