Systematic analysis of naturally occurring insertions and deletions that alter transcription factor spacing identifies tolerant and sensitive transcription factor pairs

Regulation of gene expression requires the combinatorial binding of sequence-specific transcription factors (TFs) at promoters and enhancers. Prior studies showed that alterations in the spacing between TF binding sites can influence promoter and enhancer activity. However, the relative importance of TF spacing alterations resulting from naturally occurring insertions and deletions (InDels) has not been systematically analyzed. To address this question, we first characterized the genome-wide spacing relationships of 75 TFs in K562 cells as determined by ChIP-sequencing. We found that the dominant pattern was a relaxed range of spacing between collaborative factors, with 46 TFs exhibiting exclusively relaxed spacing with their binding partners. Next, we exploited millions of InDels provided by genetically diverse mouse strains and human individuals to investigate the effects of altered spacing on TF binding and local histone acetylation. Spacing alterations resulting from naturally occurring InDels are generally tolerated compared with genetic variants that directly affect TF binding sites. A remarkable range of tolerance was further established for PU.1 and C/EBPβ, which exhibit relaxed spacing, by introducing synthetic spacing alterations ranging from a 5-bp increase to a >30-bp decrease using CRISPR/Cas9 mutagenesis. These findings have implications for understanding the mechanisms underlying enhancer selection and for interpreting non-coding genetic variation.
