Analysis of DNA methylation epidemiological data through a generic composite statistical framework

DNA methylation events are heritable epigenetic modifications that regulate gene expression by affecting chromatin remodeling. They occur most often in CpG-rich promoter regions and do not alter the DNA sequence itself. High-throughput DNA methylation profiling methods exploit microarray technologies and provide a wealth of data. These data call for analytical pipelines that are rigorous and generic, yet adjustable to the case at hand, so that meaningful systems-level analysis and interpretation become possible. In this work, the Illumina Infinium HumanMethylation450 BeadChip platform is applied to an epidemiological cohort from Italy in an effort to correlate interesting methylation patterns with breast cancer predisposition. The composite computational framework proposed here builds upon well-established analytical techniques employed in mRNA analysis. For analysis purposes, the log2 ratio of the intensity of a methylated probe (I_Meth) to that of the corresponding unmethylated probe (I_UnMeth), referred to as the M-value, is used. An intensity-based correction of the M-value distribution is applied systematically, using intensity-related error measures derived from the quality control samples included on each chip. This corrects batch effects while also taking probe-specific, intensity-related error measures into account. Robust, bootstrap-based statistical measures of biological variation at the probe level are derived in order to propose candidate biomarkers. To this end, coefficient of variation measurements of DNA methylation in controls and cases are used, which at the same time mitigate the impact of technical variation, and these are compared against classical measures of statistical differential analysis.
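As an illustration of two quantities named in the abstract, the following minimal Python sketch computes an M-value as the log2 ratio of methylated to unmethylated intensity, and a bootstrap-based coefficient of variation for one probe. It is not the authors' implementation: the function names `m_value` and `bootstrap_cv`, the optional `offset` parameter, the number of resamples, and the use of the median over resamples are all assumptions made for this example, and the intensity-based correction step is not shown.

```python
import numpy as np


def m_value(i_meth, i_unmeth, offset=0.0):
    """Return log2(I_Meth / I_UnMeth) for each probe.

    `offset` is an assumption of this sketch (not from the abstract):
    a small positive value can be added to both intensities to avoid
    taking the log of zero.
    """
    i_meth = np.asarray(i_meth, dtype=float)
    i_unmeth = np.asarray(i_unmeth, dtype=float)
    return np.log2((i_meth + offset) / (i_unmeth + offset))


def bootstrap_cv(values, n_boot=1000, seed=0):
    """Median coefficient of variation over bootstrap resamples.

    Hypothetical estimator: the abstract only states that the measures
    are bootstrap-based. `values` holds one probe's measurements across
    samples of a group (at least two values, with a non-zero mean).
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Draw n_boot resamples with replacement, one per row.
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    samples = values[idx]
    # CV = sample standard deviation / |mean|, computed per resample.
    cvs = samples.std(axis=1, ddof=1) / np.abs(samples.mean(axis=1))
    return float(np.median(cvs))


if __name__ == "__main__":
    # Toy intensities for one probe across five samples (made-up numbers).
    meth = [400.0, 380.0, 420.0, 390.0, 410.0]
    unmeth = [100.0, 110.0, 95.0, 105.0, 100.0]
    m = m_value(meth, unmeth)
    print("M-values:", m)
    print("bootstrap CV of M-values:", bootstrap_cv(m))
```

Under these assumptions, the same `bootstrap_cv` call could be run separately on controls and on cases for each probe and the two results compared. How the comparison is actually performed in the framework is not specified in the abstract.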
