Exploratory visualizations and statistical analysis of large, heterogeneous epigenetic datasets

Epigenetic marks, such as DNA methylation and histone modifications, are important regulatory mechanisms that allow a single genomic sequence to give rise to a complex multicellular organism. The analyses that can be performed when studying epigenetic regulation depend on the experimental technologies and the available data. Recent advances in sequencing technologies allow genome-wide maps of epigenetic marks to be produced efficiently, and large-scale mapping projects such as ENCODE and IHEC are generating data for many tissues and cell cultures. The growing quantity of data exposes a major bottleneck in bioinformatic research, namely the lack of bioinformatic tools suited to analyzing these data. Existing tools support detailed (mostly visual) inspection of single genomic loci, allowing biologists to focus their research on regions of interest. Efficient tools for manipulating and analyzing the data have also been published, but they often require computer science skills. Furthermore, the available tools address only biological questions that are already well formulated. What is missing, in our opinion, are tools (or pipelines of tools) for exploring the data interactively, in a process that helps a trained biologist recognize interesting aspects and pursue them further until concrete hypotheses are formulated. A possible solution stems from best practices in the fields of information retrieval and exploratory search. In this thesis, I propose EpiExplorer, a paradigm that integrates state-of-the-art information retrieval methods and indexing structures to offer instant, interactive exploration of large epigenetic datasets. The algorithms we use were developed for semi-structured text data; we apply them to bioinformatic data by mapping biological properties to text. We demonstrate the power of EpiExplorer in a series of studies that address interesting biological problems.
This manuscript also presents EpiGRAPH, a bioinformatics software tool that we developed with colleagues. EpiGRAPH helps identify and model significant biological associations among epigenetic and genetic properties for sets of genomic regions. Used independently or in a pipeline, EpiExplorer and EpiGRAPH provide the bioinformatic community with access to large databases of annotations, support exploratory visualization and statistical analysis, and facilitate the reproduction and sharing of results.
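
The idea of mapping biological properties to text so that text-indexing methods can be applied may be easier to picture with a small illustration. The Python sketch below is not EpiExplorer's actual encoding or search backend. The token scheme (`chr:`, `meth:`, `overlap:`), the function names and the example regions are all hypothetical, chosen only to show how annotated genomic regions could be turned into words and then queried through a simple inverted index.

```python
from collections import defaultdict


def region_to_tokens(region):
    """Encode one annotated region as text tokens (hypothetical scheme)."""
    tokens = [f"chr:{region['chrom']}"]
    # Bin the methylation ratio (0.0 to 1.0) into ten coarse buckets, 0..9.
    meth_bin = min(int(region["methylation"] * 10), 9)
    tokens.append(f"meth:{meth_bin}")
    # One token per overlapping annotation, e.g. a histone mark or CpG island.
    for annotation in region["overlaps"]:
        tokens.append(f"overlap:{annotation}")
    return tokens


def build_index(regions):
    """Build an inverted index: token -> set of region identifiers."""
    index = defaultdict(set)
    for region_id, region in regions.items():
        for token in region_to_tokens(region):
            index[token].add(region_id)
    return index


def query(index, *tokens):
    """Return the regions that carry every requested token."""
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings) if postings else set()


# Made-up example regions, for illustration only.
regions = {
    "r1": {"chrom": "chr1", "methylation": 0.05, "overlaps": ["H3K4me3", "CGI"]},
    "r2": {"chrom": "chr1", "methylation": 0.92, "overlaps": ["H3K27me3"]},
    "r3": {"chrom": "chr2", "methylation": 0.08, "overlaps": ["H3K4me3"]},
}
index = build_index(regions)

# Regions that overlap H3K4me3 and fall in the lowest methylation bucket.
hits = query(index, "overlap:H3K4me3", "meth:0")
print(sorted(hits))
```

Once regions are expressed as tokens, each refinement of a query becomes a set intersection over precomputed posting lists. This is the property that makes interactive, step-by-step filtering feasible on large collections.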
