MEME-ChIP: motif analysis of large DNA datasets

Motivation: Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets, including those arising from transcription factor (TF) ChIP-seq experiments. While many tools exist for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process datasets of this size.

Results: The MEME-ChIP web service is designed to analyze ChIP-seq ‘peak regions’, the short genomic regions surrounding declared ChIP-seq ‘peaks’. Given a set of genomic regions, it performs (i) ab initio motif discovery, (ii) motif enrichment analysis, (iii) motif visualization, (iv) binding affinity analysis and (v) motif identification. It runs two complementary motif discovery algorithms, MEME and DREME, on the input data and uses the motifs they discover in the subsequent visualization, binding affinity and identification steps. MEME-ChIP also performs motif enrichment analysis using the AME algorithm, which can detect very low levels of enrichment of binding sites for TFs with known DNA-binding motifs. Importantly, unlike the MEME web service, MEME-ChIP places no restriction on the size or number of uploaded sequences, allowing very large ChIP-seq datasets to be analyzed. The analyses performed by MEME-ChIP give the user a varied view of the binding and regulatory activity of the ChIP-ed TF, as well as of the possible involvement of other DNA-binding TFs.

Availability: MEME-ChIP is available as part of the MEME Suite at http://meme.nbcr.net.

Contact: t.bailey@uq.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
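The notion of a ‘peak region’ used in the Results can be illustrated with a minimal sketch. It assumes that each declared peak is given as a chromosome name and a summit coordinate, and that a region is a fixed-width window centered on that summit. The function name `peak_regions`, the `Region` type and the default `half_width` of 50 bp are hypothetical choices made for this illustration; they are not taken from the MEME-ChIP implementation.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """A genomic interval (hypothetical type for illustration)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def peak_regions(summits: List[Tuple[str, int]],
                 half_width: int = 50) -> List[Region]:
    """Build one fixed-width region around each peak summit.

    The window extends half_width bases to each side of the summit and
    is clipped at the start of the chromosome. The default width is an
    assumption chosen only for this example.
    """
    regions = []
    for chrom, summit in summits:
        start = max(0, summit - half_width)  # never go below position 0
        end = summit + half_width
        regions.append(Region(chrom, start, end))
    return regions


# Two declared peaks; the second lies close to the chromosome start.
example = peak_regions([("chr1", 1000), ("chr2", 20)])
for region in example:
    print(f"{region.chrom}\t{region.start}\t{region.end}")
```

The sequences underlying such regions would then be extracted from the genome and uploaded for analysis. Clipping at the chromosome end is omitted here because it requires chromosome lengths, which this sketch does not assume are available.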
