BayMeth: improved DNA methylation quantification for affinity capture sequencing data using a flexible Bayesian approach

Affinity capture of DNA methylation combined with high-throughput sequencing strikes a good balance between the high cost of whole-genome bisulfite sequencing and the low coverage of methylation arrays. We present BayMeth, an empirical Bayes approach that uses a fully methylated control sample to transform observed read counts into regional methylation levels. In our model, inefficient capture can readily be distinguished from low methylation levels. BayMeth improves on existing methods, allows explicit modeling of copy number variation, and offers computationally efficient analytical mean and variance estimators. BayMeth is available in the Repitools Bioconductor package.
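
To make concrete how a fully methylated control can separate inefficient capture from low methylation, here is a toy sketch. It is not the Repitools implementation, and every modeling choice in it is an assumption made for illustration: reads in a region are Poisson, the control count has rate lambda, the sample count has rate f * mu * lambda (mu is the methylation level in [0, 1], f a relative sequencing depth), lambda has a Gamma(alpha, beta) prior, and mu has a uniform prior. The function name and parameters are hypothetical. Integrating out lambda gives a posterior for mu proportional to (f*mu)^y_sample / (beta + 1 + f*mu)^(alpha + y_sample + y_control), which the sketch evaluates on a grid rather than analytically.

```python
import math


def methylation_posterior(y_sample, y_control, f=1.0, alpha=1.0, beta=1.0, n_grid=1001):
    """Grid-based posterior mean and variance of mu under the toy model above.

    Hypothetical helper for illustration only; all priors are assumptions.
    """
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    log_w = []
    for mu in grid:
        rate = f * mu
        if rate == 0.0:
            # mu = 0 is impossible once any sample read has been observed
            lw = 0.0 if y_sample == 0 else -math.inf
        else:
            lw = y_sample * math.log(rate)
        # Term left over after integrating out the Gamma-distributed lambda
        lw -= (alpha + y_sample + y_control) * math.log(beta + 1.0 + rate)
        log_w.append(lw)

    # Normalise in log space for numerical stability
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    mean = sum(g * wi for g, wi in zip(grid, w)) / total
    var = sum((g - mean) ** 2 * wi for g, wi in zip(grid, w)) / total
    return mean, var


# No sample reads, but the control captured well: evidence for low methylation.
mean_low, var_low = methylation_posterior(y_sample=0, y_control=200)
# No sample reads and no control reads: capture is inefficient, mu stays uncertain.
mean_unc, var_unc = methylation_posterior(y_sample=0, y_control=0)
print(mean_low, var_low)
print(mean_unc, var_unc)
```

In this sketch, a zero sample count pulls the estimate towards zero only when the control shows that the region can be captured; with an empty control the posterior stays close to the prior and its variance remains large.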
