A Composite Framework for the Statistical Analysis of Epidemiological DNA Methylation Data with the Infinium Human Methylation 450K BeadChip

High-throughput DNA methylation profiling with microarray technologies yields a wealth of data, which in turn calls for rigorous, generic analytical pipelines for efficient systems-level analysis and interpretation. In this study, we use Illumina's Infinium Human Methylation 450K BeadChip platform in an epidemiological cohort, aiming to associate methylation patterns with breast cancer predisposition. The computational framework proposed here extends the M-value, the logarithmic ratio of methylated to unmethylated signal intensities that is well established in transcriptomic microarray analysis. In addition, an intensity-based correction of the M-signal distribution is introduced to correct for batch effects and probe-specific errors in intensity measurements. This is accomplished by estimating intensity-related error measures from the quality control samples included in each chip. Furthermore, robust statistical measures that exploit the coefficient of variation of DNA methylation measurements between control and case samples alleviate the impact of technical variation. The results are compared with those obtained by applying classical preprocessing and statistical selection methodologies. Overall, in comparison to traditional approaches, the superior performance of the proposed framework in terms of technical bias correction, along with its generic character, supports its suitability for various microarray technologies.
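As a minimal illustration of the two quantities named above, the following Python sketch computes an M-value as the base-2 logarithm of the methylated-to-unmethylated intensity ratio, and a coefficient of variation across replicate measurements. It is not the framework's implementation: the function names, the stabilizing `offset` added to both intensities, and the use of the sample standard deviation are assumptions made for illustration, and the intensity-based correction step is not reproduced here.

```python
import math
import statistics


def m_value(methylated, unmethylated, offset=1.0):
    """Log2 ratio of methylated to unmethylated signal intensity.

    `offset` is an assumed small constant added to both intensities to
    avoid division by zero and damp noise at low signal levels.
    """
    return math.log2((methylated + offset) / (unmethylated + offset))


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean.

    Assumes at least two values and a non-zero mean.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.stdev(values) / abs(mean)


if __name__ == "__main__":
    # Hypothetical probe intensities (methylated, unmethylated) for one CpG site.
    print(m_value(3000.0, 1000.0))
    # Hypothetical replicate measurements of the same probe.
    print(coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

An M-value of zero corresponds to equal methylated and unmethylated signal, while positive and negative values indicate predominantly methylated and unmethylated sites respectively. The coefficient of variation is unstable when the mean is close to zero, so in practice it would need to be applied to a scale on which that cannot happen.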
