Gene expression. Advance Access publication May 4, 2011

DREME: motif discovery in transcription factor ChIP-seq data

Motivation: Transcription factor (TF) ChIP-seq datasets have particular characteristics that present unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of the ChIP-ed TF.

Results: We present DREME, a motif discovery algorithm specifically designed to find the short, core DNA-binding motifs of eukaryotic TFs, and optimized to analyze very large ChIP-seq datasets in minutes. Using DREME, we discover the binding motifs of the ChIP-ed TF and many cofactors in mouse ES cell (mESC), mouse erythrocyte and human cell line ChIP-seq datasets. For example, in mESC ChIP-seq data for the TF Esrrb, we discover the binding motifs for eight cofactor TFs important in the maintenance of pluripotency. Several other commonly used algorithms find at most two cofactor motifs in this same dataset. DREME can also perform discriminative motif discovery, and we use this feature to provide evidence that Sox2 and Oct4 do not bind in mESCs as an obligate heterodimer. DREME is much faster than many commonly used algorithms, scales linearly in dataset size, finds multiple, non-redundant motifs and reports a reliable measure of statistical significance for each motif found. DREME is available as part of the MEME Suite of motif-based sequence analysis tools (http://meme.nbcr.net).

Contact: t.bailey@uq.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
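The idea behind discriminative motif discovery with a per-motif significance value can be illustrated with a small sketch. The abstract does not specify the statistical test, so the one-sided Fisher's exact test used here is an assumption made for illustration, and the function names (`reverse_complement`, `count_sequences_with_word`, `fisher_enrichment_p`) and the toy sequences are hypothetical. The sketch counts how many sequences in a positive set and in a negative (control) set contain a candidate word on either strand, then asks how surprising the positive count is.

```python
from math import comb

# Illustrative sketch only: not DREME's implementation.
# Assumption: enrichment is scored with a one-sided Fisher's exact test.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sequences_with_word(seqs, word):
    """Count sequences containing the word on either strand."""
    rc = reverse_complement(word)
    return sum(1 for s in seqs if word in s or rc in s)


def fisher_enrichment_p(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher's exact test p-value.

    Probability of seeing at least `pos_hits` matching sequences among
    the positives if the matching sequences were spread at random over
    the positive and negative sets (hypergeometric upper tail).
    """
    total = pos_total + neg_total
    hits = pos_hits + neg_hits
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


# Toy data (hypothetical): three of four positives contain GATAA or
# its reverse complement TTATC; none of the negatives do.
positives = ["ACGTGATAAG", "TTGATAAC", "CCTTATCAGG", "ACGTACGT"]
negatives = ["ACGTACGT", "TTTTCCCC", "GGGGAAAA", "CATGCATG"]

pos_hits = count_sequences_with_word(positives, "GATAA")
neg_hits = count_sequences_with_word(negatives, "GATAA")
p_value = fisher_enrichment_p(pos_hits, len(positives), neg_hits, len(negatives))
print(pos_hits, neg_hits, f"{p_value:.4f}")  # 3 0 0.0714
```

Counting sequences rather than individual word occurrences keeps the two sets comparable when sequence lengths differ. A real tool would also need to correct for the number of candidate words tested, which this sketch omits.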
