Detection of significantly differentially methylated regions in targeted bisulfite sequencing data

MOTIVATION Bisulfite sequencing is currently the gold standard for obtaining genome-wide DNA methylation profiles in eukaryotes. Pre-processing and alignment software has developed rapidly, but methods for analyzing the resulting methylation profiles remain relatively limited. For instance, an appropriate pipeline to detect DNA methylation differences between cancer and control samples is still required.

RESULTS We propose an algorithm that detects significantly differentially methylated regions in data obtained by targeted bisulfite sequencing approaches, such as reduced representation bisulfite sequencing. In a first step, this approach tests all target regions for methylation differences while taking spatial dependence into account. A false discovery rate procedure controls the expected proportion of incorrectly rejected regions. In a second step, the significant target regions are trimmed to the regions that are actually differentially methylated. This hierarchical procedure detects differentially methylated regions with increased power compared with existing methods.

AVAILABILITY R/Bioconductor package BiSeq.

SUPPLEMENTARY INFORMATION Supplementary Data are available at Bioinformatics online.
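
The two-step procedure described under RESULTS can be illustrated with a minimal Python sketch. It is not the BiSeq implementation: it assumes that one p-value per target region and one p-value per CpG site are already available, it uses the plain Benjamini-Hochberg step-up rule in place of the method's own false discovery rate procedure, and it trims with a fixed site-level threshold. The function names, the input layout, and the default values of `alpha` and `site_threshold` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def benjamini_hochberg(p_values: List[float], alpha: float) -> List[bool]:
    """Return one rejection flag per p-value (Benjamini-Hochberg step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k / m * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


def trim_region(
    positions: List[int], site_p: List[float], threshold: float
) -> Optional[Tuple[int, int]]:
    """Shrink a region to the span between its first and last significant site."""
    hits = [pos for pos, p in zip(positions, site_p) if p <= threshold]
    if not hits:
        return None
    return (min(hits), max(hits))


def detect_dmrs(
    regions: Dict[str, dict], alpha: float = 0.1, site_threshold: float = 0.05
) -> Dict[str, Tuple[int, int]]:
    """Step 1: test regions under FDR control. Step 2: trim the significant ones.

    Each entry of `regions` is assumed (hypothetical layout) to hold
    'region_p', 'positions' and 'site_p'.
    """
    names = list(regions)
    significant = benjamini_hochberg([regions[n]["region_p"] for n in names], alpha)
    result = {}
    for name, is_significant in zip(names, significant):
        if not is_significant:
            continue
        trimmed = trim_region(
            regions[name]["positions"], regions[name]["site_p"], site_threshold
        )
        if trimmed is not None:
            result[name] = trimmed
    return result
```

Testing whole regions first keeps the number of hypotheses in the first step small, and only the regions that survive it are examined site by site, which is the intuition behind the hierarchical design.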
