Summary: Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify motifs of non-targeted transcription factors (TFs), and are further limited when peak sequences are similar because of common ancestry rather than common binding factors. The latter problem particularly affects many proteins of the Cys2His2 zinc finger (C2H2-ZF) class of TFs, whose binding sites are often dominated by endogenous retroelements with highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE can identify generalizable binding models even from peaks located exclusively within repeat regions of the genome, where state-of-the-art motif finding approaches largely fail.
Availability and implementation: RCADE is available as a webserver and for download at http://rcade.ccbr.utoronto.ca/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Contact: t.hughes@utoronto.ca
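The Summary describes combining a recognition-code prediction with ChIP-seq data but does not spell out the procedure, so the following is only an illustrative sketch of the general idea, not RCADE's actual algorithm. It assumes a hypothetical seed motif (standing in for a recognition-code prediction), toy peak and background sequences, forward-strand scanning only, and uppercase A/C/G/T input. All function names and values are invented for illustration. The sketch scores each sequence by its best motif match and checks, via AUROC, whether the seed separates peaks from background.

```python
import math

BASES = "ACGT"


def log_odds(pwm, background=0.25, pseudocount=0.01):
    """Convert a position probability matrix into log2-odds scores."""
    matrix = []
    for column in pwm:
        total = sum(column[b] + pseudocount for b in BASES)
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def best_score(matrix, sequence):
    """Best log-odds score over all forward-strand windows of the sequence."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


def auroc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical seed motif favouring "GGC"; a stand-in for a
# recognition-code prediction, not a real C2H2-ZF model.
seed_pwm = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]

# Toy sequences (assumptions): "peaks" contain GGC, "background" does not.
peaks = ["ATGGCAT", "TTGGCTA", "CAGGCAA"]
background = ["ATATATA", "TTTTAAA", "CACACAC"]

matrix = log_odds(seed_pwm)
peak_scores = [best_score(matrix, s) for s in peaks]
background_scores = [best_score(matrix, s) for s in background]
enrichment = auroc(peak_scores, background_scores)
print(f"AUROC of seed motif on toy data: {enrichment:.2f}")
```

In a real analysis the background set, strand handling, and the way the seed motif is refined against the ChIP-seq data would all matter. The abstract does not describe those details, so they are deliberately left out here.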