Evaluation of marker selection methods and statistical models for chronological age prediction based on DNA methylation.

In forensic investigation, retrieving biological information from DNA evidence is a promising field. One application is estimating the age of the donor from DNA methylation. Many studies have focused on age prediction using the 450 K Human Methylation Beadchip, and various marker selection methods and prediction models have been considered. However, little research has evaluated different high-dimensional variable selection methods for CpG sites in combination with various age prediction models. The aim of this study is to evaluate four variable selection methods (forward selection, LASSO, elastic net and SCAD), combined with a classical statistical model and sophisticated machine learning models, based on the mean absolute deviation (MAD) and the root-mean-square error (RMSE). We used a publicly available 450 K data set containing 991 whole blood samples (age 19-101 years). We found that the multiple linear regression model with 16 markers selected by forward selection performed very well in age prediction (MAD = 3.76 years and RMSE = 5.01 years). By contrast, the more advanced ultrahigh-dimensional variable selection methods and sophisticated machine learning algorithms appeared unnecessary for age prediction based on DNA methylation.
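The best-performing combination named above (forward selection feeding a multiple linear regression, scored by MAD and RMSE) can be illustrated with a minimal sketch. This is not the authors' code or data: the methylation matrix is synthetic, only three informative sites are simulated rather than 16 markers, and all function and variable names (`forward_select`, `fit_ols`, `mad`, `rmse`) are hypothetical. The selection criterion used here, training residual sum of squares, is also an assumption; the study may have used a different entry rule.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Predict from the coefficient vector returned by fit_ols."""
    return coef[0] + X @ coef[1:]

def mad(y_true, y_pred):
    """Mean absolute deviation between true and predicted age."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted age."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def forward_select(X, y, n_markers):
    """Greedy forward selection: at each step add the column that
    gives the lowest training residual sum of squares (assumed criterion)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_markers):
        best_j, best_rss = None, np.inf
        for j in remaining:
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            rss = np.sum((y - predict_ols(coef, X[:, cols])) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic stand-in for methylation beta values (NOT the 450 K data set).
rng = np.random.default_rng(0)
n_samples, n_sites = 300, 50
age = rng.uniform(19, 101, size=n_samples)
X = rng.uniform(0, 1, size=(n_samples, n_sites))
# Columns 0-2 are simulated age-associated sites; the rest are pure noise.
for j, slope in enumerate([0.5, -0.5, 0.5]):
    X[:, j] = 0.5 + slope * (age - 60) / 100 + rng.normal(0, 0.02, n_samples)

# Hold out the last 100 samples for testing.
X_train, X_test = X[:200], X[200:]
y_train, y_test = age[:200], age[200:]

selected = forward_select(X_train, y_train, n_markers=3)
coef = fit_ols(X_train[:, selected], y_train)
pred = predict_ols(coef, X_test[:, selected])
test_mad, test_rmse = mad(y_test, pred), rmse(y_test, pred)
print("selected sites:", sorted(selected))
print(f"MAD = {test_mad:.2f} years, RMSE = {test_rmse:.2f} years")
```

Scoring on held-out samples rather than the training set matters here, because greedy selection over many candidate sites tends to flatter the training error.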
