Age-adjusted nonparametric detection of differential DNA methylation with case–control designs

Background: DNA methylation profiles differ among disease types and can therefore be used in disease diagnosis. In addition, large-scale whole-genome DNA methylation data offer tremendous potential for understanding the role of DNA methylation in normal development and function. However, owing to the distinctive features of methylation data, few powerful and robust statistical methods are available in this area.

Results: In this paper, we propose and examine a new statistical method for detecting differentially methylated loci in case-control designs. The method is fully nonparametric and does not depend on any assumption about the underlying distribution of the data. Moreover, it adjusts for the effect of age, which has been shown to be highly correlated with DNA methylation profiles. Using simulation studies and a real data application, we demonstrate the advantages of our method over commonly used existing methods.

Conclusions: Compared with existing methods, our method improves the power to detect differentially methylated loci in case-control designs while controlling the type I error rate well. Its applications are not limited to methylation data; it can be extended to many other case-control studies.
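The abstract does not state the test statistic, so the following is only a generic illustration of what an age-adjusted, distribution-free case-control comparison at a single locus can look like. It is not the method proposed in the paper. The sketch assumes that age has been binned into strata, ranks methylation values within each stratum, sums the centered ranks of the cases, and obtains a two-sided p-value by permuting case labels within strata. All names (`midranks`, `stratified_rank_permutation_test`, `beta`, `is_case`, `age_stratum`) are hypothetical.

```python
import random


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def stratified_rank_permutation_test(beta, is_case, age_stratum,
                                     n_perm=2000, seed=0):
    """Two-sided permutation p-value for one locus, stratified by age.

    beta        : methylation values, one per sample
    is_case     : booleans, True for cases and False for controls
    age_stratum : age-bin label per sample (the binning is an assumption)
    """
    rng = random.Random(seed)

    # Group sample indices by age stratum
    strata = {}
    for idx, label in enumerate(age_stratum):
        strata.setdefault(label, []).append(idx)

    # Rank within each stratum and center the ranks at the stratum mean
    per_stratum = []
    for members in strata.values():
        ranks = midranks([beta[i] for i in members])
        mean_rank = (len(members) + 1) / 2
        centered = [r - mean_rank for r in ranks]
        labels = [bool(is_case[i]) for i in members]
        per_stratum.append((centered, labels))

    def statistic(label_sets):
        # Sum of centered ranks over all cases, across strata
        total = 0.0
        for (centered, _), labels in zip(per_stratum, label_sets):
            total += sum(c for c, flag in zip(centered, labels) if flag)
        return total

    observed = statistic([labels for _, labels in per_stratum])

    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for _, labels in per_stratum:
            perm = labels[:]
            rng.shuffle(perm)  # permute case labels within the stratum only
            shuffled.append(perm)
        if abs(statistic(shuffled)) >= abs(observed) - 1e-12:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive
    return (hits + 1) / (n_perm + 1)
```

Because labels are shuffled only within an age stratum, any difference in age distribution between cases and controls cannot by itself produce a small p-value. In a genome-wide setting such a function would be applied locus by locus, followed by a multiple-testing correction.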
