Assessing genome-wide significance for the detection of differentially methylated regions

Abstract: DNA methylation plays an important role in human health and disease, and methods for identifying differentially methylated regions are of increasing interest. Statistical methods that properly address multiple testing, i.e. that control genome-wide significance for differentially methylated regions, are currently lacking. We introduce a scan statistic (DMRScan) that overcomes this limitation. We benchmark DMRScan against two well-established methods (bumphunter, DMRcate) in a simulation study based on real methylation data. An implementation of DMRScan is available from Bioconductor. Our method has higher power than the alternative methods across the simulation scenarios, particularly for small effect sizes. DMRScan also offers greater flexibility in statistical modeling and can be used with more complex designs than current methods. DMRScan is the first dynamic approach that properly addresses the multiple-testing challenges in identifying differentially methylated regions. It outperformed the alternative methods in terms of power while keeping the false discovery rate controlled.
