Differential variability improves the identification of cancer risk markers in DNA methylation studies profiling precursor cancer lesions

MOTIVATION: The standard paradigm in omic disciplines has been to identify biologically relevant biomarkers using statistics that reflect differences in mean levels of a molecular quantity such as mRNA expression or DNA methylation. Recently, however, it has been proposed that differential epigenetic variability may mark genes that contribute to the risk of complex genetic diseases like cancer and that identification of risk and early detection markers may therefore benefit from statistics based on differential variability.

RESULTS: Using four genome-wide DNA methylation datasets totalling 311 epithelial samples and encompassing all stages of cervical carcinogenesis, we here formally demonstrate that differential variability, as a criterion for selecting DNA methylation features, can identify cancer risk markers more reliably than statistics based on differences in mean methylation. We show that differential variability selects features with heterogeneous outlier methylation profiles and that these play a key role in the early stages of carcinogenesis. Moreover, differentially variable features identified in precursor non-invasive lesions exhibit significantly increased enrichment for developmental genes compared with differentially methylated sites. Conversely, differential variability does not add predictive value in cancer studies profiling invasive tumours or whole-blood tissue. Finally, we incorporate the differential variability feature selection step into a novel adaptive index prediction algorithm called EVORA (epigenetic variable outliers for risk prediction analysis), and demonstrate that EVORA compares favourably to powerful prediction algorithms based on differential methylation statistics.

CONCLUSIONS: Statistics based on differential variability improve the detection of cancer risk markers in the context of DNA methylation studies profiling epithelial preinvasive neoplasias. We present a novel algorithm (EVORA) which could be used for prediction and diagnosis of precursor epithelial cancer lesions.

AVAILABILITY: R-scripts implementing EVORA are available from CRAN (www.r-project.org).
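The contrast between mean-based and variability-based feature selection can be illustrated with a minimal sketch. It is not the EVORA implementation: the abstract does not specify which variance test EVORA uses, so a plain log variance ratio stands in for it here, and the function names and methylation values are hypothetical. The example shows a single CpG where one precursor sample carries an outlier methylation value, which barely moves the mean-based statistic but strongly inflates the variance.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for a difference in mean methylation (b minus a)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def log_variance_ratio(a, b):
    """log(var(b) / var(a)); positive values mean b is more variable than a.

    Assumption: a simple stand-in for a formal differential variability test.
    """
    return math.log(variance(b) / variance(a))


# Hypothetical methylation beta values for one CpG site (illustrative only).
normal = [0.10, 0.12, 0.09, 0.11, 0.10, 0.12, 0.09, 0.11]
# Precursor lesions: most samples look normal, one is a methylation outlier.
precursor = [0.10, 0.11, 0.09, 0.12, 0.10, 0.11, 0.75, 0.09]

t_stat = welch_t(normal, precursor)
lvr = log_variance_ratio(normal, precursor)

print(f"Welch t-statistic:  {t_stat:.2f}")
print(f"log variance ratio: {lvr:.2f}")
```

In this toy case a mean-based ranking would give the site a weak score, whereas a variability-based ranking would place it near the top, which is the kind of heterogeneous outlier profile the abstract describes.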
