Transform-Domain Classification of Human Cells based on DNA Methylation Datasets

A novel method for classifying human cells is presented, based on a transform-domain analysis of DNA methylation data. DNA methylation profiles in human cells vary as disease progresses, and the proposed method exploits this variation to distinguish normal cells from diseased cells, including cancer cells. The cancer types investigated are hepatocellular (sample size n = 40), colorectal (n = 44), lung (n = 70) and endometrial (n = 87). A new pipeline is proposed that integrates the DNA methylation intensity measurements across all CpG islands using the Walsh-Hadamard Transform (WHT). The study reveals a three-step structure in the transform-domain DNA methylation data and an association between the step values and cell status. The proposed machine learning pipeline is further assessed on the classification of normal and cancer tissue cells. Several machine learning classifiers are compared on whole-sequence and WHT-sequence classification using public Whole-Genome Bisulfite Sequencing (WGBS) DNA methylation datasets. Compared with classification on the whole original sequence, the WHT-based method reduces computation time by more than one order of magnitude while maintaining comparable classification accuracy with the selected classifiers. The proposed method has broad applications in the expedited classification of diseased and normal human cells from epigenome and genome datasets.
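
As a rough illustration of the transform step named in the abstract, the sketch below applies an unnormalized fast Walsh-Hadamard transform to a vector of methylation intensities. It is not the authors' pipeline: the function names `fwht` and `wht_features`, the zero-padding to a power-of-two length, and the choice to keep the first `k` coefficients as classifier features are all assumptions made for the example, since the abstract does not specify these details.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    The input length must be a power of two. Runs in O(n log n).
    """
    a = np.asarray(x, dtype=float).copy()
    n = a.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of length h pairwise (sum, difference).
        a = a.reshape(-1, 2, h)
        top = a[:, 0, :] + a[:, 1, :]
        bot = a[:, 0, :] - a[:, 1, :]
        a = np.stack((top, bot), axis=1).reshape(n)
        h *= 2
    return a


def wht_features(profile, k):
    """Hypothetical feature extractor: zero-pad a methylation profile to the
    next power of two, transform it, and keep the first k coefficients.

    Both the padding and the truncation rule are assumptions for illustration.
    """
    x = np.asarray(profile, dtype=float)
    n = x.size
    m = 1 << (n - 1).bit_length()  # next power of two >= n
    padded = np.zeros(m)
    padded[:n] = x
    return fwht(padded)[:k]


if __name__ == "__main__":
    # Toy per-CpG-island methylation intensities in [0, 1] (made-up values).
    profile = [0.2, 0.8, 0.5]
    print(wht_features(profile, k=2))
```

A classifier would then be trained on the shorter coefficient vectors instead of the full sequence, which is one plausible way a transform-domain representation could reduce computation time.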
