Evolutionary Conserved Motif Finder (ECMFinder) for genome-wide identification of clustered YY1- and CTCF-binding sites

We have developed a new bioinformatics approach called ECMFinder (Evolutionary Conserved Motif Finder). The program searches the entire genome of one species for a given DNA motif and then uses the gene association of each potential transcription factor-binding site (TFBS) to screen the homologous regions of a second and a third species. If a potential TFBS is present at homologous positions in multiple species, the program designates it an evolutionary conserved motif (ECM). The output is a list of ECMs that can be uploaded as a Custom Track to the UCSC Genome Browser and visualized alongside other available data. We tested the feasibility of this approach by searching the genomes of three mammals (human, mouse and cow) with the DNA-binding motifs of YY1 and CTCF. The program identified many previously undetected clustered YY1- and CTCF-binding sites that are conserved among these species. In particular, it identified CTCF-binding sites located close to the imprinted genes Dlk1, Magel2 and Cdkn1c. Individual ChIP experiments confirmed in vivo binding of the YY1 and CTCF proteins to most of these newly discovered sites, demonstrating the feasibility and usefulness of ECMFinder.
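
The workflow described above (motif search in one species, gene-based screening of other species, and export as a browser track) can be illustrated with a minimal Python sketch. This is not the ECMFinder implementation. The motif string `GCCATNTT` is only an illustrative YY1-like consensus, the gene names and sequences are made up, and the function names `scan`, `conserved_genes` and `to_bed` are hypothetical. Homology is approximated here by looking up the same gene name in each species, which is a simplifying assumption.

```python
import re

# Map consensus symbols to regex fragments (only A, C, G, T and N are handled here).
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(seq, motif):
    """Find motif hits on both strands as (start, end, strand), 0-based half-open."""
    hits = []
    for strand, pattern in (("+", motif), ("-", revcomp(motif))):
        # A lookahead lets overlapping occurrences be reported as well.
        regex = re.compile("(?=(" + "".join(SYMBOLS[b] for b in pattern) + "))")
        for match in regex.finditer(seq.upper()):
            hits.append((match.start(), match.start() + len(motif), strand))
    return sorted(hits)


def conserved_genes(regions, motif, reference):
    """Genes whose associated region contains the motif in every species.

    `regions` maps species -> {gene name -> sequence of the associated region}.
    Hits are first sought in `reference`; the same gene is then screened in
    the remaining species (an assumed stand-in for homology lookup).
    """
    others = [s for s in regions if s != reference]
    kept = []
    for gene, seq in regions[reference].items():
        if not scan(seq, motif):
            continue
        if all(scan(regions[s].get(gene, ""), motif) for s in others):
            kept.append(gene)
    return sorted(kept)


def to_bed(chrom, offset, hits, name):
    """Format hits as BED lines; `offset` is the region start on the chromosome."""
    return [
        f"{chrom}\t{offset + start}\t{offset + end}\t{name}\t0\t{strand}"
        for start, end, strand in hits
    ]


# Toy data: hypothetical genes and sequences, for illustration only.
MOTIF = "GCCATNTT"
REGIONS = {
    "human": {"GENE_A": "AAGCCATCTTGGAAAGATGGCTT", "GENE_B": "ACGTACGTACGT"},
    "mouse": {"GENE_A": "TTGCCATATTCC", "GENE_B": "GCCATGTT"},
    "cow": {"GENE_A": "CGCCATTTTG", "GENE_B": "AAAAAAAA"},
}

human_hits = scan(REGIONS["human"]["GENE_A"], MOTIF)
ecm_genes = conserved_genes(REGIONS, MOTIF, reference="human")
track = ['track name="ECM_sketch" description="toy conserved motif hits"']
track += to_bed("chr1", 1000, human_hits, "YY1_like")
print("\n".join(track))
```

The printed text begins with a `track` line followed by tab-separated BED records, which is the general shape of a file accepted as a UCSC custom track. A real analysis would replace the same-gene-name lookup with proper homologous-region alignments and would likely score motifs with a position weight matrix rather than an exact consensus.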
