DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

Background: The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases reveals thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). Identifying common conserved motifs requires a dedicated tool for searching amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s).

Results: We have developed a new tool called DoOPSearch (http://doopsearch.abc.hu) for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows users to search these motif collections or the promoter regions of DoOP with user-supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program.

Conclusion: We present here a comparative genomics-based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user-specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue to the function of the motifs and genes.
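
The workflow described above (query a collection of gene-linked conserved motifs with a consensus sequence, then test the resulting gene list for overrepresented annotation terms) can be illustrated with a minimal Python sketch. This is not the DoOPSearch or GeneMerge implementation: the function names, the toy motif records, and the toy annotations are all hypothetical, matching is done on the forward strand only with exact IUPAC matching, and the enrichment step assumes a plain hypergeometric test with a simple Bonferroni correction, which may differ from what the modified GeneMerge does.

```python
import re
from math import comb

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def genes_with_motif(consensus, motifs):
    """Return the sorted gene IDs whose motif contains the consensus.

    `motifs` is a list of (gene_id, motif_sequence) pairs; this record
    layout is an assumption made for the example only.
    """
    pattern = consensus_to_regex(consensus)
    return sorted({gene for gene, seq in motifs if pattern.search(seq.upper())})


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enriched_terms(gene_list, annotations):
    """Bonferroni-corrected p-value for every term seen in the gene list."""
    population = len(annotations)
    study = [g for g in gene_list if g in annotations]
    terms = {t for g in study for t in annotations[g]}
    results = {}
    for term in terms:
        K = sum(term in ts for ts in annotations.values())
        k = sum(term in annotations[g] for g in study)
        p = hypergeom_pvalue(k, len(study), K, population)
        results[term] = min(1.0, p * len(terms))
    return results


# Toy data (hypothetical gene IDs, motif sequences and annotations).
motifs = [
    ("geneA", "TTGGGACTTTCCAA"),
    ("geneB", "ACGTACGTAC"),
    ("geneC", "GGGAATTCCC"),
    ("geneD", "CCCGGGAAATCCCTT"),
]
annotations = {
    "geneA": {"immune response"},
    "geneB": {"metabolism"},
    "geneC": {"immune response"},
    "geneD": {"immune response", "metabolism"},
}

hits = genes_with_motif("GGGRNNYYCC", motifs)
scores = enriched_terms(hits, annotations)
```

Here the query `GGGRNNYYCC` stands in for a consensus sequence of a known transcription factor binding site. With only four toy genes the corrected p-values are far from significant; the point is the shape of the computation, in which a motif search yields a gene list and the gene list is then scored term by term against the whole annotated population.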
