methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles

DNA methylation is a chemical modification of cytosine bases that is pivotal for gene regulation, cellular specification and cancer development. Here, we describe an R package, methylKit, that rapidly analyzes genome-wide cytosine epigenetic profiles from high-throughput methylation and hydroxymethylation sequencing experiments. methylKit includes functions for clustering, sample quality visualization, differential methylation analysis and annotation features, thus automating and simplifying many of the steps for discerning statistically significant bases or regions of DNA methylation. Finally, we demonstrate methylKit on breast cancer data, in which we find statistically significant regions of differential methylation and stratify tumor subtypes. methylKit is available at http://code.google.com/p/methylkit.
