Systematic benchmarking of tools for CpG methylation detection from nanopore sequencing

DNA methylation plays a fundamental role in the control of gene expression and genome integrity. Although multiple tools enable its detection from nanopore sequencing, their accuracy remains largely unknown. Here, we present a systematic benchmarking of tools for the detection of CpG methylation from nanopore sequencing using individual reads, control mixtures of methylated and unmethylated reads, and bisulfite sequencing. We found that the tools show a tradeoff between false positives and false negatives, and that their predictions are highly dispersed around the expected methylation frequency values. We describe various strategies to improve the accuracy of these tools, including a new consensus approach, METEORE (https://github.com/comprna/METEORE), which combines the predictions from two or more tools and shows improved accuracy over individual tools. Snakemake pipelines are provided for reproducibility and to enable the systematic application of our analyses to other datasets.
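To make the idea of combining predictions from two tools concrete, the following is a minimal sketch and not the METEORE implementation. It assumes each tool reports a methylation probability per read and CpG position, averages the two scores for calls made by both tools, and then derives a per-site methylation frequency. The function names, the dictionary input format, the 0.5 threshold, and the example values are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def consensus_scores(tool_a, tool_b):
    """Average per-read, per-site methylation probabilities from two tools.

    Each input maps (read_id, position) -> probability of methylation.
    Only keys scored by both tools are kept (an assumption of this sketch).
    """
    shared = tool_a.keys() & tool_b.keys()
    return {key: (tool_a[key] + tool_b[key]) / 2 for key in shared}


def methylation_frequency(scores, threshold=0.5):
    """Return, for each site, the fraction of reads called methylated."""
    methylated = defaultdict(int)
    total = defaultdict(int)
    for (_read_id, position), prob in scores.items():
        total[position] += 1
        if prob >= threshold:
            methylated[position] += 1
    return {pos: methylated[pos] / total[pos] for pos in total}


# Hypothetical per-read scores from two tools at a single CpG site.
tool_a = {("read1", 100): 0.9, ("read2", 100): 0.2, ("read3", 100): 0.7}
tool_b = {("read1", 100): 0.8, ("read2", 100): 0.4, ("read3", 100): 0.1}

combined = consensus_scores(tool_a, tool_b)
freq = methylation_frequency(combined)
print(freq)
```

In this toy input the two tools disagree on read3, and averaging pulls its combined score below the threshold. A plain average is only one possible way to combine tools; a trained model over the per-tool scores is another.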
