Ultra-high dimensional variable selection with application to normative aging study: DNA methylation and metabolic syndrome

Background: Metabolic syndrome has become a major public health challenge worldwide. The association between metabolic syndrome and DNA methylation is of great research interest.

Results: We constructed a binomial model to investigate the association between a metabolic syndrome index and DNA methylation in the Normative Aging Study. We applied the Iterative Sure Independence Screening (ISIS) method with elastic net penalty to DNA methylation levels at 484,548 CpG markers from 659 human subjects, and demonstrated that the screening step in ISIS can significantly improve the performance of the elastic net.

Conclusion: The proposed method identifies four CpGs which can be mapped to two biologically relevant and functional genes. Identification of significant CpG markers may potentially have practical implications for disease prevention and treatment.
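
To make the screening idea concrete, the following is a minimal sketch of a single, non-iterative sure independence screening pass on simulated data. It is not the analysis pipeline of the study: the ranking statistic (absolute marginal correlation), the cutoff d = n / log(n), the toy binary outcome, and the function names `marginal_scores` and `sis_screen` are all assumptions made for illustration. In a full ISIS procedure, a penalized fit such as the elastic net would be run on the retained markers and the screening repeated conditionally on the selected set.

```python
import math
import random


def marginal_scores(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    n = len(y)
    p = len(X[0])
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(d * d for d in c_dev))
        if c_norm == 0 or y_norm == 0:
            # A constant column carries no marginal signal.
            scores.append(0.0)
        else:
            cov = sum(a * b for a, b in zip(c_dev, y_dev))
            scores.append(abs(cov) / (c_norm * y_norm))
    return scores


def sis_screen(X, y, d=None):
    """Return the sorted indices of the d features with the largest scores.

    The default d = floor(n / log(n)) is an assumed, commonly used cutoff.
    """
    n = len(y)
    if d is None:
        d = max(1, int(n / math.log(n)))
    scores = marginal_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Simulated data (hypothetical): 100 subjects, 500 markers,
# with a binary outcome driven by markers 3 and 7 only.
rng = random.Random(0)
n, p = 100, 500
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if 2.0 * row[3] - 2.0 * row[7] + rng.gauss(0, 0.5) > 0 else 0
     for row in X]

kept = sis_screen(X, y)
```

Here `kept` holds the indices of the markers that survive screening; only these columns would be passed to the penalized regression step, which is what reduces the problem from hundreds of thousands of candidates to a size the elastic net handles well.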
