Identification and Characterization of Cell Type–Specific and Ubiquitous Chromatin Regulatory Structures in the Human Genome

Identifying regulatory elements in different cell types is necessary for understanding the mechanisms that control cell type–specific and housekeeping gene expression. Mapping DNaseI hypersensitive (HS) sites is an accurate method for locating functional regulatory elements. We used a high-throughput method called DNase-chip to identify 3,904 DNaseI HS sites from six cell types across 1% of the human genome. A substantial fraction (22%) of the DNaseI HS sites from each cell type are ubiquitous, that is, present in all cell types studied. Surprisingly, nearly all of these ubiquitous DNaseI HS sites correspond to either promoters or insulator elements: 86% are located near annotated transcription start sites and 10% are bound by CTCF, a protein with known enhancer-blocking insulator activity. We also identified a large number of cell type–specific DNaseI HS sites (present in only one cell type); these regions are enriched for enhancer elements and correlate with cell type–specific gene expression as well as cell type–specific histone modifications. Finally, we found that approximately 8% of the genome overlaps a DNaseI HS site in at least one of the six cell lines studied, indicating that a significant percentage of the genome is potentially functional.
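
The two site categories defined above (ubiquitous: present in all cell types; cell type specific: present in only one) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy coordinates, the placeholder cell type labels "A", "B", and "C", and the rule that any overlap merges two sites into one region are all assumptions made for illustration. Because merging is transitive, two sites that do not overlap directly can end up in the same region if a third site bridges them.

```python
def merge_sites(sites_by_cell_type):
    """Merge overlapping sites across cell types into union regions.

    sites_by_cell_type maps a cell type label to a list of
    (chrom, start, end) tuples with half-open coordinates.
    Returns a list of [chrom, start, end, set_of_cell_types].
    """
    tagged = sorted(
        (chrom, start, end, cell)
        for cell, sites in sites_by_cell_type.items()
        for chrom, start, end in sites
    )
    merged = []
    for chrom, start, end, cell in tagged:
        last = merged[-1] if merged else None
        # Assumption: any overlap at all joins a site to the current region.
        if last is not None and last[0] == chrom and start < last[2]:
            last[2] = max(last[2], end)
            last[3].add(cell)
        else:
            merged.append([chrom, start, end, {cell}])
    return merged


def classify_regions(merged, n_cell_types):
    """Label each region by the number of cell types in which it is present."""
    labels = []
    for chrom, start, end, cells in merged:
        if len(cells) == n_cell_types:
            label = "ubiquitous"
        elif len(cells) == 1:
            label = "cell-type-specific"
        else:
            label = "shared"  # present in some, but not all, cell types
        labels.append((chrom, start, end, label))
    return labels


# Toy input with hypothetical cell types and coordinates.
toy_sites = {
    "A": [("chr1", 100, 200), ("chr1", 500, 600)],
    "B": [("chr1", 150, 250)],
    "C": [("chr1", 180, 220), ("chr1", 900, 950)],
}
toy_labels = classify_regions(merge_sites(toy_sites), n_cell_types=len(toy_sites))
for region in toy_labels:
    print(region)
```

In this toy input, the region chr1:100-250 is labeled ubiquitous because all three hypothetical cell types contribute a site to it, while chr1:500-600 and chr1:900-950 are each labeled cell type specific.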
