ChromaSig: A Probabilistic Approach to Finding Common Chromatin Signatures in the Human Genome

Computational methods that identify functional genomic elements from genetic information have been very successful in determining gene structure and in identifying a handful of cis-regulatory elements. But the vast majority of regulatory elements have yet to be discovered, and it has become increasingly apparent that their discovery will not come from genetic information alone. Recently, high-throughput technologies have enabled the creation of information-rich epigenetic maps, most notably for histone modifications. However, tools that search for functional elements using this epigenetic information have been lacking. Here, we describe an unsupervised learning method called ChromaSig that finds, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data. Applying this algorithm to nine chromatin marks across a 1% sampling of the human genome in HeLa cells, we recover eight clusters of distinct chromatin signatures, five of which correspond to known patterns associated with transcriptional promoters and enhancers. Interestingly, the distinct chromatin signatures found at enhancers mark distinct functional classes of enhancers in terms of transcription factor and coactivator binding. In addition, we identify three clusters of novel chromatin signatures that contain evolutionarily conserved sequences and potential cis-regulatory elements. Applying ChromaSig to a panel of 21 chromatin marks mapped genome-wide by ChIP-Seq reveals 16 classes of genomic elements marked by distinct chromatin signatures. Four of these classes, which are enriched for repressive histone modifications, appear to be locally heterochromatic sites and are enriched in quickly evolving regions of the genome. By uncovering novel, functionally significant genomic elements, this approach will aid future efforts to annotate the genome using chromatin modifications.
