Background: DNA methylation is an epigenetic modification that is studied at single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Even though a variety of software packages is available for different aspects of the bioinformatics analysis, they often produce biased results or have excessive computational requirements.

Results: DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded very few genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, even though they share the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software.

Conclusions: Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux.
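
The clustering step described in the Results can be illustrated with a minimal sketch. It shows plain one-dimensional single-linkage clustering of methylation sites by genomic distance, not DMRfinder's modified variant, whose details are not given here. The function name `cluster_sites` and the parameters `max_gap` and `min_sites`, including their default values, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Site = Tuple[str, int]               # (chromosome, 1-based position)
Region = Tuple[str, int, int, int]   # (chromosome, start, end, number of sites)


def cluster_sites(sites: List[Site], max_gap: int = 100,
                  min_sites: int = 3) -> List[Region]:
    """Group methylation sites into regions by single-linkage clustering.

    Two neighboring sites on the same chromosome are linked when they lie
    at most `max_gap` bases apart (hypothetical threshold). Clusters with
    fewer than `min_sites` sites (also hypothetical) are discarded.
    """
    regions: List[Region] = []
    current: List[Site] = []

    def flush() -> None:
        # Emit the current cluster if it contains enough sites.
        if len(current) >= min_sites:
            regions.append((current[0][0], current[0][1],
                            current[-1][1], len(current)))

    for chrom, pos in sorted(sites):
        # Close the cluster on a chromosome change or an oversized gap.
        if current and (chrom != current[-1][0]
                        or pos - current[-1][1] > max_gap):
            flush()
            current = []
        current.append((chrom, pos))
    flush()
    return regions
```

For example, calling `cluster_sites` on sites at chr1 positions 100, 150, 220, 1000 and 1050 with the defaults above groups the first three sites into one region spanning 100 to 220 and discards the remaining two, because they form a cluster that is too small. In one dimension, single-linkage clustering reduces to a single pass over sorted positions, which is part of why this step can be fast.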