MeDeCom: discovery and quantification of latent components of heterogeneous methylomes

Large-scale epigenomic studies need to identify and characterize hidden confounding variation, most importantly differences in cell composition. We developed MeDeCom, a novel reference-free computational framework that decomposes complex DNA methylomes into latent methylation components and their proportions in each sample. MeDeCom is based on constrained non-negative matrix factorization with a new biologically motivated regularization function. It accurately recovers cell-type-specific latent methylation components and their proportions. MeDeCom is a new unsupervised tool for the exploratory study of the major sources of methylation variation, which should lead to a deeper understanding and better biological interpretation of methylation data.
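The decomposition described above can be illustrated with a minimal sketch. It is not the MeDeCom implementation; it only assumes a model of the form D ≈ TA, where D is a CpG-by-sample methylation matrix, the entries of T (latent methylation components) lie in [0, 1], and each column of A (proportions) is non-negative and sums to one. The abstract does not specify the regularization function, so the penalty used here, lam * sum(T * (1 - T)), which favors component values near 0 or 1, is an assumption, as are the names `decompose`, `project_simplex`, `objective` and `lam`. The solver is plain alternating projected gradient descent, chosen for brevity.

```python
import numpy as np


def project_simplex(V):
    """Euclidean projection of every column of V onto the probability simplex."""
    k, n = V.shape
    U = np.sort(V, axis=0)[::-1]                 # each column sorted descending
    css = np.cumsum(U, axis=0) - 1.0
    idx = np.arange(1, k + 1)[:, None]
    rho = (U - css / idx > 0).sum(axis=0) - 1    # last index satisfying the condition
    theta = css[rho, np.arange(n)] / (rho + 1)
    return np.maximum(V - theta, 0.0)


def objective(D, T, A, lam):
    """Squared reconstruction error plus the assumed penalty on T."""
    return 0.5 * np.sum((T @ A - D) ** 2) + lam * np.sum(T * (1.0 - T))


def decompose(D, k, lam=0.0, n_iter=500, seed=0):
    """Factorize D (CpGs x samples) into T (CpGs x k) and A (k x samples).

    Constraints: 0 <= T <= 1, A >= 0, columns of A sum to 1.
    Returns T, A and the objective value before and after every iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = D.shape
    T = rng.uniform(0.0, 1.0, (m, k))
    A = project_simplex(rng.uniform(0.0, 1.0, (k, n)))
    history = [objective(D, T, A, lam)]
    for _ in range(n_iter):
        # Update T: gradient step on the objective, then clip to [0, 1].
        grad_T = (T @ A - D) @ A.T + lam * (1.0 - 2.0 * T)
        step_T = 1.0 / (np.linalg.norm(A @ A.T, 2) + 1e-12)
        T = np.clip(T - step_T * grad_T, 0.0, 1.0)
        # Update A: gradient step, then project each column onto the simplex.
        grad_A = T.T @ (T @ A - D)
        step_A = 1.0 / (np.linalg.norm(T.T @ T, 2) + 1e-12)
        A = project_simplex(A - step_A * grad_A)
        history.append(objective(D, T, A, lam))
    return T, A, history
```

Calling `decompose(D, k=3, lam=0.01)` on a matrix of beta values returns the estimated components, the per-sample proportions, and the objective trace. In practice the number of components `k` and the weight `lam` would have to be chosen by some model-selection procedure, for example cross-validation, and several random restarts would be advisable because the problem is non-convex.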
