Evolution of transcription factor binding sites in Mammalian gene regulatory regions: conservation and turnover.

Comparisons between human and rodent DNA sequences are widely used to identify regulatory regions (phylogenetic footprinting), and the importance of such intergenomic comparisons for promoter annotation is growing. The efficacy of these comparisons for identifying functional regulatory elements hinges on the evolutionary dynamics of promoter sequences. Although it is widely appreciated that conservation of sequence motifs may suggest function, it is not known what proportion of the functional binding sites in humans is conserved in distant species. In this report, we analyze the evolutionary dynamics of transcription factor binding sites whose function has been experimentally verified in the promoters of 51 human genes, and we compare their sequences to homologous sequences in other primate species and in rodents. Our results show that there is extensive divergence within the nucleotide sequence of transcription factor binding sites. Using direct experimental data from functional studies in both humans and rodents for 20 of the regulatory regions, we estimate that 32%-40% of the human functional sites are not functional in rodents. This is evidence of widespread turnover of transcription factor binding sites. These results have important implications for the efficacy of phylogenetic footprinting and for the interpretation of the pattern of evolution in regulatory sequences.
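As a rough illustration of the kind of per-site comparison described above, the following Python sketch computes the fraction of identical positions within one binding site of a pairwise human-rodent alignment. It is a minimal sketch under stated assumptions and not the method used in the study: the function name `site_identity`, the convention that a site is given as alignment column indices, and the toy sequences are all hypothetical and chosen only for illustration.

```python
def site_identity(human_aln: str, rodent_aln: str, start: int, end: int) -> float:
    """Fraction of identical, non-gap positions in alignment columns [start, end).

    Assumes both strings come from the same pairwise alignment, so they have
    equal length and use '-' for gaps. Gap columns count as mismatches.
    """
    if len(human_aln) != len(rodent_aln):
        raise ValueError("aligned sequences must have equal length")
    human_site = human_aln[start:end].upper()
    rodent_site = rodent_aln[start:end].upper()
    if not human_site:
        raise ValueError("site is empty")
    matches = sum(
        1 for h, r in zip(human_site, rodent_site) if h == r and h != "-"
    )
    return matches / len(human_site)


# Toy alignment (made-up sequences, not data from the study).
human = "ACGTGGGCGGAGTT"
rodent = "ACGTGGACGG-GTT"

# Hypothetical binding site occupying alignment columns 4..9.
identity = site_identity(human, rodent, 4, 10)
print(f"site identity: {identity:.2f}")
```

Sequence identity alone does not establish whether a site is functional in the other species; the turnover estimate in the abstract rests on experimental evidence in both humans and rodents, which a sketch like this cannot replace.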
