Massively parallel in vivo enhancer assay reveals that highly local features determine the cis-regulatory function of ChIP-seq peaks

Transcription factors (TFs) recognize short sequence motifs that are present in millions of copies in large eukaryotic genomes. TFs must distinguish their target binding sites from a vast genomic excess of spurious motif occurrences; however, it is unclear whether functional sites are distinguished from nonfunctional motifs by local primary sequence features or by the larger genomic context in which motifs reside. We used a massively parallel enhancer assay in living mouse retinas to compare 1,300 sequences bound in the genome by the photoreceptor transcription factor Cone-rod homeobox (Crx) with 3,000 control sequences. We found that very short sequences bound in the genome by Crx activated transcription at high levels, whereas unbound genomic regions with equal numbers of Crx motifs did not activate above background levels, even when liberated from their larger genomic context. High local GC content strongly distinguishes bound motifs from unbound motifs across the entire genome. Our results show that the cis-regulatory potential of TF-bound DNA is determined largely by highly local sequence features and not by genomic context.
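To make the notion of "local GC content" around a motif concrete, the following is a minimal sketch of how such a feature could be computed for each motif occurrence in a sequence. It is not the analysis pipeline used in the study: the motif string `TAATCC`, the flank size, the toy sequences, and all function names are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: motif string, flank size, and function names
# are assumptions, not taken from the study.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif_sites(seq, motif):
    """Return start positions where the motif occurs on either strand."""
    seq = seq.upper()
    targets = {motif.upper(), reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases (0.0 if there are none)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def local_gc(seq, start, motif_len, flank):
    """GC fraction of the motif plus `flank` bases on each side,
    clipped at the sequence boundaries."""
    lo = max(0, start - flank)
    hi = min(len(seq), start + motif_len + flank)
    return gc_fraction(seq[lo:hi])


if __name__ == "__main__":
    motif = "TAATCC"  # assumed placeholder for a Crx-like site
    toy_sequences = {
        "gc_rich_context": "GCGCGCTAATCCGCGCGC",
        "at_rich_context": "ATATATTAATCCATATAT",
    }
    for name, seq in toy_sequences.items():
        for site in find_motif_sites(seq, motif):
            gc = local_gc(seq, site, len(motif), flank=6)
            print(f"{name}: site at {site}, local GC = {gc:.2f}")
```

Both toy sequences contain the same motif once, so they differ only in the composition of the flanking bases; this is the kind of local feature the abstract reports as separating bound from unbound motifs.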
