Impact of cytosine methylation on DNA binding specificities of human transcription factors

Positives and negatives of methylated CpG

When the DNA bases cytosine and guanine are next to each other, a methyl group is generally added to the pyrimidine, generating an mCpG dinucleotide. This modification alters DNA structure but can also affect function by inhibiting transcription factor (TF) binding. Yin et al. systematically analyzed the effect of CpG methylation on the binding of 542 human TFs (see the Perspective by Hughes and Lambert). In addition to inhibiting binding of some TFs, they found that mCpGs can promote binding of others, particularly TFs involved in development, such as homeodomain proteins.

Science, this issue p. eaaj2239; see also p. 489

Genome-scale analysis reveals positive and negative binding of transcription factors to methylated CpG dinucleotides.

INTRODUCTION

Nearly all cells in the human body share the same primary genome sequence consisting of four nucleotide bases. One of the bases, cytosine, is commonly modified by methylation of its 5 position in CpG dinucleotides (mCpG). Most CpG dinucleotides in the human genome are methylated, but the level of CpG methylation varies with genomic location (promoter versus gene body), whether genes are active or silenced, and cell type. Research has shown that the maintenance of a particular cellular state after cell division depends on faithful transmission of methylated CpGs, as well as on inheritance of the mother cell's repertoire of transcription factors by the daughter cells. These two mechanisms of epigenetic inheritance are linked to each other: the binding of transcription factors can be affected by cytosine methylation, and cytosine methylation can, in turn, be added or removed by proteins that associate with transcription factors.

RATIONALE

The genetic and epigenetic language, which imparts when and where genes are expressed, is understood at a conceptual level. However, a more detailed understanding is needed of the genomic regulatory mechanism by which methylated cytosines affect transcription factor binding. Because cytosine methylation changes DNA structure, it has the potential to affect binding of all transcription factors. However, a systematic analysis of binding of a large collection of transcription factors to all possible DNA sequences has not previously been conducted.

RESULTS

To globally characterize the effect of cytosine methylation on transcription factor binding, we systematically analyzed binding specificities of full-length transcription factors and extended DNA binding domains to unmethylated and CpG-methylated DNA by using methylation-sensitive SELEX (systematic evolution of ligands by exponential enrichment). We evaluated binding of 542 transcription factors and identified a large number of previously uncharacterized transcription factor recognition motifs. Binding of most major classes of transcription factors, including bHLH, bZIP, and ETS, was inhibited by mCpG. In contrast, transcription factors such as homeodomain, POU, and NFAT proteins preferred to bind methylated DNA. This class of binding was enriched in factors with central roles in embryonic and organismal development.

The observed binding preferences were validated using several orthogonal methods, including bisulfite-SELEX and protein-binding microarrays. In addition, the preference of the pluripotency factor OCT4 to bind to an mCpG-containing motif was confirmed by chromatin immunoprecipitation analysis in mouse embryonic stem cells with low or high levels of CpG methylation (due to deficiency in all enzymes that methylate cytosines or contribute to their removal, respectively).

Crystal structure analysis of the homeodomain proteins HOXB13, CDX1, CDX2, and LHX4 revealed three key residues that contribute to the preference of this developmentally important family of transcription factors for mCpG. The preference for binding to mCpG was due to direct hydrophobic interactions with the 5-methyl group of methylcytosine. In contrast, inhibition of binding of other transcription factors to methylated sequences was found to be caused by steric hindrance.

CONCLUSION

Our work constitutes a global analysis of the effect of cytosine methylation on DNA binding specificities of human transcription factors. CpG methylation can influence binding of most transcription factors to DNA, in some cases negatively and in others positively. Our finding that many developmentally important transcription factors prefer to bind to mCpG sites can inform future analyses of the role of DNA methylation in cell differentiation, chromatin reprogramming, and transcriptional regulation.

Systematic analysis of the impact of CpG methylation on transcription factor binding. The bottom left panel shows the fraction of transcription factors that prefer methylated (orange) or unmethylated (teal) CpG sites, are affected in multiple ways (yellow), are not affected (green), or do not have a CpG in their motifs (gray), as determined by methylation-sensitive SELEX (top left). The structure and logos on the right highlight how HOXB13 recognizes mCpG (blue shading indicates a CpG affected by methylation).

The majority of CpG dinucleotides in the human genome are methylated at cytosine bases. However, active gene regulatory elements are generally hypomethylated relative to their flanking regions, and the binding of some transcription factors (TFs) is diminished by methylation of their target sequences. By analysis of 542 human TFs with methylation-sensitive SELEX (systematic evolution of ligands by exponential enrichment), we found that there are also many TFs that prefer CpG-methylated sequences. Most of these are in the extended homeodomain family. Structural analysis showed that homeodomain specificity for methylcytosine depends on direct hydrophobic interactions with the methylcytosine 5-methyl group. This study provides a systematic examination of the effect of an epigenetic DNA modification on human TF binding specificity and reveals that many developmentally important proteins display preference for mCpG-containing sequences.
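
The five categories in the figure caption (prefers methylated, prefers unmethylated, affected in multiple ways, not affected, no CpG in the motif) can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline. It assumes that, for one TF, k-mer counts are available from a selection on CpG-methylated ligands and from a selection on unmethylated ligands, and it compares the relative frequency of each CpG-containing k-mer between the two. The function names, the pseudocount, the log2 threshold of 1.0, and the toy counts are all hypothetical choices made for illustration.

```python
import math


def log2_methyl_ratio(count_meth, total_meth, count_unmeth, total_unmeth,
                      pseudocount=1.0):
    """log2 of (k-mer frequency in methylated library) / (frequency in unmethylated).

    The pseudocount is an assumed smoothing term that avoids division by zero.
    """
    freq_meth = (count_meth + pseudocount) / (total_meth + pseudocount)
    freq_unmeth = (count_unmeth + pseudocount) / (total_unmeth + pseudocount)
    return math.log2(freq_meth / freq_unmeth)


def classify_tf(kmer_counts_meth, kmer_counts_unmeth, threshold=1.0):
    """Assign one TF to a category based on its CpG-containing k-mers.

    Both arguments map k-mer strings to counts. The threshold (in log2 units)
    is an illustrative cutoff, not a value taken from the paper.
    """
    total_meth = sum(kmer_counts_meth.values())
    total_unmeth = sum(kmer_counts_unmeth.values())

    # Only k-mers containing a CpG can be affected by CpG methylation.
    all_kmers = set(kmer_counts_meth) | set(kmer_counts_unmeth)
    cpg_kmers = sorted(k for k in all_kmers if "CG" in k)
    if not cpg_kmers:
        return "no CpG"

    ratios = [
        log2_methyl_ratio(kmer_counts_meth.get(k, 0), total_meth,
                          kmer_counts_unmeth.get(k, 0), total_unmeth)
        for k in cpg_kmers
    ]
    enriched = any(r >= threshold for r in ratios)
    depleted = any(r <= -threshold for r in ratios)

    if enriched and depleted:
        return "multiple effects"
    if enriched:
        return "prefers methylated"
    if depleted:
        return "prefers unmethylated"
    return "not affected"


# Toy counts invented for illustration; they are not data from the study.
meth_counts = {"TCGTAAA": 900, "TTTTAAA": 100}
unmeth_counts = {"TCGTAAA": 100, "TTTTAAA": 900}
print(classify_tf(meth_counts, unmeth_counts))  # prints "prefers methylated"
```

A real analysis would work from full motif models and replicate selections rather than a handful of k-mers, so the sketch only shows the shape of the comparison.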
