Estimating and testing high-dimensional mediation effects in epigenetic studies

MOTIVATION: High-dimensional DNA methylation markers may mediate pathways linking environmental exposures with health outcomes. However, analytical methods for identifying significant mediators in high-dimensional mediation analysis are lacking.

RESULTS: Based on sure independence screening and minimax concave penalty techniques, we use a joint significance test for the mediation effect. We demonstrate its practical performance using Monte Carlo simulation studies and apply the method to investigate the extent to which DNA methylation markers mediate the causal pathway from smoking to reduced lung function in the Normative Aging Study. We identify two CpGs with significant mediation effects.

AVAILABILITY AND IMPLEMENTATION: R package, source code, and simulation study are available at https://github.com/YinanZheng/HIMA.

CONTACT: lei.liu@northwestern.edu
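The screening and joint significance steps named in the abstract can be illustrated with a minimal sketch. This is not the HIMA package's code or API: the function names (`screen`, `joint_significance`, `screen_and_test`) are hypothetical, the minimax concave penalty step is omitted (each retained mediator is tested one at a time rather than in a penalized joint model), p-values use a normal approximation, and the screening size `d` and the Bonferroni adjustment are assumptions made for illustration.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value of z under a standard normal reference (approximation)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def residualize(v, x):
    """Residuals of v after simple linear regression (with intercept) on x."""
    n = len(v)
    mx, mv = sum(x) / n, sum(v) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mv) for a, b in zip(x, v)) / sxx
    return [b - mv - slope * (a - mx) for a, b in zip(x, v)]

def slope_z(x, y, df):
    """Slope of y on x divided by its standard error, using residual df."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    rss = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    return slope / math.sqrt(rss / df / sxx)

def screen(mediators, y, d):
    """Keep the d mediators with the largest absolute marginal correlation with y."""
    order = sorted(range(len(mediators)),
                   key=lambda j: abs(corr(mediators[j], y)), reverse=True)
    return order[:d]

def joint_significance(x, m, y):
    """Max of the p-values for exposure -> mediator and mediator -> outcome | exposure."""
    n = len(x)
    p_alpha = two_sided_p(slope_z(x, m, n - 2))
    # Partial out the exposure from both m and y, then regress residual on residual.
    p_beta = two_sided_p(slope_z(residualize(m, x), residualize(y, x), n - 3))
    return max(p_alpha, p_beta)

def screen_and_test(x, mediators, y, d):
    """Screen, then return Bonferroni-adjusted joint p-values for kept mediators."""
    kept = screen(mediators, y, d)
    return {j: min(1.0, len(kept) * joint_significance(x, mediators[j], y))
            for j in kept}

# Simulated data (illustrative only): mediator 0 lies on the pathway, the rest are noise.
random.seed(0)
n, p = 200, 50
x = [random.gauss(0, 1) for _ in range(n)]
mediators = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
mediators[0] = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [0.5 * xi + 0.9 * m0 + random.gauss(0, 1) for xi, m0 in zip(x, mediators[0])]
result = screen_and_test(x, mediators, y, d=10)
```

Taking the maximum of the two path p-values means a mediator is declared significant only when both the exposure-to-mediator and the mediator-to-outcome associations are supported, which is the idea behind a joint significance test.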
