Finding functional promoter motifs by computational methods: a word of caution

Standard practice in promoter analysis is to select promoter regions of a convenient length. Because such sequences may contain coding segments, this can produce false results when searching for transcription factor binding sites (TFBSs): motif detection may single out motifs that originate in the coding regions, and mapping TFBSs to the selected sequences then gives a misleading picture of 'promoter' content. We illustrate these issues using histones H2A and H2B as an example and show how such an analysis can mislead if coding regions are not removed from the presumed promoter sequences.
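As a minimal sketch of the precaution described above (not the authors' procedure), the following Python code removes annotated coding segments from a presumed promoter sequence before it is passed to a motif finder. The function names, the toy sequence, and the coordinate convention (0-based, half-open offsets into the extracted sequence) are assumptions made for illustration only; in practice the coding intervals would come from a gene annotation.

```python
def mask_coding(seq, coding_intervals, mask_char="N"):
    """Return seq with every coding interval replaced by mask_char.

    coding_intervals: iterable of (start, end) offsets into seq,
    0-based and half-open (an assumed convention, not from the paper).
    """
    masked = list(seq)
    for start, end in coding_intervals:
        # Clip each interval to the bounds of the sequence.
        for i in range(max(0, start), min(len(seq), end)):
            masked[i] = mask_char
    return "".join(masked)


def noncoding_segments(seq, coding_intervals, min_len=1):
    """Return the stretches of seq that lie outside all coding intervals."""
    segments = []
    pos = 0  # first position not yet assigned to a segment or an interval
    for start, end in sorted(coding_intervals):
        start = max(start, 0)
        end = min(end, len(seq))
        if start > pos:
            segments.append(seq[pos:start])
        pos = max(pos, end)  # handles overlapping or nested intervals
    if pos < len(seq):
        segments.append(seq[pos:])
    return [s for s in segments if len(s) >= min_len]


if __name__ == "__main__":
    # Toy example: 13 bp of upstream sequence followed by 9 bp of coding sequence.
    upstream = "TATAAAGGCCAAT"
    coding = "ATGGCTCGT"
    window = upstream + coding
    cds = [(len(upstream), len(window))]
    print(mask_coding(window, cds))
    print(noncoding_segments(window, cds))
```

Masking keeps the original coordinates, which is convenient when predicted sites are later mapped back to the genome, whereas splitting into non-coding segments suits tools that do not accept ambiguous bases. Either way, a motif that is over-represented only because of shared coding sequence can no longer be reported as a promoter element.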
