Detecting Functional Modules of Transcription Factor Binding Sites in the Human Genome

This paper presents a method for predicting biologically meaningful modules of transcription factors, using the CORG database of conserved transcription factor binding sites. We aim to enhance the power of in silico binding site predictions by imposing three constraints. First, we rely on conserved promoter regions of orthologous genes in human and mouse. Second, we look for synergistic transcription factor modules whose members preferentially bind upstream regions together. Third, we restrict our results to modules whose target genes show a significant functional overlap. Many of our predicted binding sites agree with known biology, as evidenced by a direct comparison with a single large-scale experiment for E2F binding. We also recover known combinations of transcription factors with a functional enrichment in the set of their shared target genes, and we suggest several new modules for experimental investigation. Finally, we study the transcription factor network and suggest a classification of transcription factors according to their regulatory power and control.
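
The second and third constraints can be illustrated with a minimal sketch. It assumes that a module's targets are the genes whose conserved promoters carry sites for every factor in the module, and that functional overlap is scored with a one-sided hypergeometric test; the abstract does not name the statistic actually used, so the test, the function names, and the toy gene and factor identifiers below are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def module_targets(targets_by_tf, module):
    """Genes predicted to be bound by every factor in the module (assumed definition)."""
    return set.intersection(*(targets_by_tf[tf] for tf in module))


def functional_enrichment(targets, annotated, universe):
    """One-sided enrichment p-value of an annotation among the module's targets."""
    targets = targets & universe
    annotated = annotated & universe
    k = len(targets & annotated)
    return hypergeom_sf(k, len(universe), len(annotated), len(targets))


# Toy data: 20 hypothetical genes and two hypothetical factors.
universe = {f"g{i}" for i in range(20)}
targets_by_tf = {
    "TF_A": {f"g{i}" for i in range(0, 6)},   # g0..g5
    "TF_B": {f"g{i}" for i in range(2, 8)},   # g2..g7
}
annotated = {"g2", "g3", "g4", "g5", "g10"}   # genes sharing one functional category

shared = module_targets(targets_by_tf, ("TF_A", "TF_B"))
p_value = functional_enrichment(shared, annotated, universe)
print(sorted(shared), p_value)
```

In this toy case all four shared targets fall into the annotated category, so the p-value is small; a module would pass the third constraint only if such a p-value stays below a chosen threshold after correction for the number of modules and categories tested.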
