CTCF binding site classes exhibit distinct evolutionary, genomic, epigenomic and transcriptomic features

Background: CTCF (CCCTC-binding factor) is an evolutionarily conserved zinc finger protein involved in diverse functions ranging from negative regulation of MYC, to chromatin insulation of the beta-globin gene cluster, to imprinting of the Igf2 locus. The 11 zinc fingers of CTCF are known to differentially contribute to the CTCF-DNA interaction at different binding sites. It is possible that the differences in CTCF-DNA conformation at different binding sites underlie CTCF's functional diversity. If so, the CTCF binding sites may belong to distinct classes, each compatible with a specific functional role.

Results: We have classified approximately 26,000 CTCF binding sites in CD4+ T cells into three classes based on their similarity to the well-characterized CTCF DNA-binding motif. We have comprehensively characterized these three classes of CTCF sites with respect to several evolutionary, genomic, epigenomic, transcriptomic and functional features. We find that the low-occupancy sites tend to be cell type specific. Furthermore, while the high-occupancy sites associate with repressive histone marks and greater gene co-expression within a CTCF-flanked block, the low-occupancy sites associate with active histone marks and higher gene expression. We found that the low-occupancy sites have greater conservation in their flanking regions compared to high-occupancy sites. Interestingly, based on a novel class-conservation metric, we observed that human low-occupancy sites tend to be conserved as low-occupancy sites in mouse (and vice versa) more frequently than expected.

Conclusions: Our work reveals several key differences among CTCF occupancy-based classes and suggests a critical, yet distinct functional role played by low-occupancy sites.
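As a rough illustration of what classifying sites "based on their similarity to the well-characterized CTCF DNA-binding motif" could look like, the sketch below scores each site with a log-odds position weight matrix and splits the scores into three equal-sized bins. Everything specific here is an assumption made for illustration and is not taken from the abstract: the toy 4-position count matrix (not the real CTCF motif), the uniform background, the pseudocount, the tertile cutoffs, and the names `log_odds_matrix`, `best_motif_score` and `classify_by_tertile`. The abstract does not state the scoring scheme or thresholds actually used.

```python
import math

# Hypothetical 4-position count matrix (NOT the real CTCF motif).
# Each list holds per-position counts for one base.
TOY_COUNTS = {
    "A": [1, 1, 8, 1],
    "C": [8, 7, 1, 1],
    "G": [1, 1, 1, 8],
    "T": [1, 2, 1, 1],
}
BACKGROUND = 0.25  # assumed uniform base composition
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(counts, pseudocount=1.0):
    """Convert base counts into per-position log2 odds against the background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in "ACGT")
        matrix.append({
            b: math.log2((counts[b][i] + pseudocount) / total / BACKGROUND)
            for b in "ACGT"
        })
    return matrix


def best_motif_score(seq, matrix):
    """Best window score over both strands; seq must be uppercase ACGT only."""
    width = len(matrix)
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    best = float("-inf")
    for strand in (seq, reverse_complement):
        for start in range(len(strand) - width + 1):
            score = sum(matrix[i][strand[start + i]] for i in range(width))
            best = max(best, score)
    return best


def classify_by_tertile(scores):
    """Label each score low/medium/high using tertile cutoffs (an assumed rule)."""
    if not scores:
        return []
    ranked = sorted(scores)
    n = len(ranked)
    low_cut = ranked[n // 3]
    high_cut = ranked[(2 * n) // 3]
    return [
        "low" if s < low_cut else "high" if s >= high_cut else "medium"
        for s in scores
    ]


if __name__ == "__main__":
    matrix = log_odds_matrix(TOY_COUNTS)
    sites = ["CCAG", "CTGG", "CTAG", "TTTT", "ACAG", "GGGG"]  # made-up sequences
    scores = [best_motif_score(s, matrix) for s in sites]
    for site, label in zip(sites, classify_by_tertile(scores)):
        print(site, label)
```

Scanning both strands means a site and its reverse complement receive the same score, which matters because a bound site can lie on either strand. Equal-sized bins are only one possible choice; fixed score thresholds would serve equally well in this sketch.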
