Jllumina - A comprehensive Java-based API for statistical Illumina Infinium HumanMethylation450 and Infinium MethylationEPIC BeadChip data processing

Summary: Measuring differential DNA methylation is currently the most common approach to linking epigenetic modifications to diseases, in what are called epigenome-wide association studies (EWAS). Because of their low cost, efficiency, and easy handling, the Illumina HumanMethylation450 BeadChip and its successor, the Infinium MethylationEPIC BeadChip, are by far the most popular technologies for conducting EWAS in large patient cohorts. Despite the popularity of this chip technology, raw data processing and statistical analysis of the array data remain far from trivial, and dedicated software libraries that enable high-quality, statistically sound downstream analyses are still lacking. To date, only R-based solutions are freely available for low-level processing of the Illumina chip data. The lack of alternative libraries poses a hurdle for the development of new bioinformatics tools, in particular for web services or applications where run time and memory consumption matter, or where EWAS data analysis is an integral part of a larger framework or data analysis pipeline. We have therefore developed and implemented Jllumina, an open-source Java library for raw data manipulation of Illumina Infinium HumanMethylation450 and Infinium MethylationEPIC BeadChip data. It provides developers with Java functions that cover the whole workflow, from reading and preprocessing the raw data to statistical assessment, permutation tests, and identification of differentially methylated loci. Jllumina is fully parallelizable and publicly available at http://dimmer.compbio.sdu.dk/download.html
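To make the "permutation tests" step concrete, the following is a minimal, self-contained sketch of a label-permutation test for a single CpG site, written in plain Java. It does not use or reproduce the Jllumina API: the class name `PermutationTestSketch`, the method names, and the choice of the absolute difference in group means as the test statistic are all assumptions made for illustration only.

```java
import java.util.Random;

// Hypothetical illustration; not part of Jllumina.
public class PermutationTestSketch {

    /** Absolute difference between the mean beta values of cases and controls. */
    static double absMeanDifference(double[] beta, boolean[] isCase) {
        double sumCase = 0.0, sumControl = 0.0;
        int nCase = 0, nControl = 0;
        for (int i = 0; i < beta.length; i++) {
            if (isCase[i]) { sumCase += beta[i]; nCase++; }
            else { sumControl += beta[i]; nControl++; }
        }
        return Math.abs(sumCase / nCase - sumControl / nControl);
    }

    /**
     * Empirical two-sided p-value for one CpG site, obtained by shuffling
     * the case/control labels. Both groups are assumed to be non-empty.
     */
    static double permutationPValue(double[] beta, boolean[] isCase,
                                    int permutations, long seed) {
        double observed = absMeanDifference(beta, isCase);
        boolean[] labels = isCase.clone(); // keep the caller's labels intact
        Random rng = new Random(seed);
        int atLeastAsExtreme = 0;
        for (int p = 0; p < permutations; p++) {
            // Fisher-Yates shuffle of the labels; group sizes stay unchanged.
            for (int i = labels.length - 1; i > 0; i--) {
                int j = rng.nextInt(i + 1);
                boolean tmp = labels[i];
                labels[i] = labels[j];
                labels[j] = tmp;
            }
            // Small tolerance guards against floating-point summation order.
            if (absMeanDifference(beta, labels) >= observed - 1e-12) {
                atLeastAsExtreme++;
            }
        }
        // Add-one correction so the estimate is never exactly zero.
        return (atLeastAsExtreme + 1.0) / (permutations + 1.0);
    }

    public static void main(String[] args) {
        // Made-up beta values for six samples at one CpG site.
        double[] beta = {0.10, 0.12, 0.11, 0.90, 0.92, 0.91};
        boolean[] isCase = {false, false, false, true, true, true};
        System.out.println(permutationPValue(beta, isCase, 1000, 42L));
    }
}
```

In a real EWAS this computation would be repeated for every CpG site on the array and followed by a multiple-testing correction. Because each site is tested independently, the per-site loop is a natural candidate for parallel execution.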
