A stochastic model for the evolution of transcription factor binding site abundance.

Both experimental and sequence evolution evidence suggest that transcription factor binding sites can undergo divergence and turnover even when the transcriptional output remains conserved. Furthermore, lineages are likely to differ in how well they retain binding sites, which makes it desirable to estimate the rates of acquisition and decay of transcription factor binding sites from comparative sequence data. In this paper we propose a stochastic, phenomenological model for binding site turnover. For a given genomic region we assume a constant rate of binding site origination, lambda, and a constant per-site decay rate, mu. We derive an explicit expression for the conditional probability distribution of the number of binding sites n at time t, given n(0) binding sites at t = 0. The analytical result was compared to a simulation model and closely predicts the simulated sequence evolution. We then analyzed a small data set of the number of estrogen response elements (EREs) in mammalian HoxA sequences and showed that the data are broadly consistent with the assumption of a stationary turnover process. A regression of shared EREs on the time since divergence led to an estimated half-life for an ERE in the primate HoxA clusters of about 27 Myr, which corresponds to a per-site decay rate of mu ≈ 1.3 x 10^-8 per year and a rate of origination of lambda ≈ 1.6 x 10^-7 per year. We conclude that the model can be used to estimate the rate of binding site turnover from comparative genomic data.
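
The abstract does not reproduce the explicit expression, so the following is only a minimal sketch of how such a conditional distribution can be computed. It assumes the model is the standard immigration-death process (equivalently, the many-server Poisson queue), for which the number of sites at time t is the sum of a binomial count of surviving original sites and a Poisson count of sites gained since t = 0. The function names `transient_pmf` and `expected_sites` are hypothetical, and the parameter values in the usage lines are the rounded estimates quoted above, used purely for illustration.

```python
import math


def transient_pmf(n, n0, lam, mu, t):
    """P(n sites at time t | n0 sites at time 0).

    Assumes a constant origination rate `lam` and a constant per-site
    decay rate `mu` (standard immigration-death process; this form is an
    assumption, not taken from the paper's text).
    """
    p = math.exp(-mu * t)        # probability an original site survives to t
    m = (lam / mu) * (1.0 - p)   # Poisson mean of new sites still present at t
    total = 0.0
    # k = number of the n0 original sites that survive; n - k sites are new
    for k in range(0, min(n, n0) + 1):
        binom = math.comb(n0, k) * p**k * (1.0 - p) ** (n0 - k)
        pois = math.exp(-m) * m ** (n - k) / math.factorial(n - k)
        total += binom * pois
    return total


def expected_sites(n0, lam, mu, t):
    """Mean number of sites at time t under the same assumptions."""
    p = math.exp(-mu * t)
    return n0 * p + (lam / mu) * (1.0 - p)


if __name__ == "__main__":
    lam, mu = 1.6e-7, 1.3e-8   # per year, rounded values quoted in the abstract
    n0, t = 12, 27e6           # illustrative starting count and elapsed years
    print("stationary mean lam/mu:", lam / mu)
    print("expected sites at t:", expected_sites(n0, lam, mu, t))
    print("P(n = n0 at t):", transient_pmf(n0, n0, lam, mu, t))
```

Under these assumptions the distribution relaxes toward a Poisson distribution with mean lambda/mu as t grows, which is the stationary state that the ERE counts are tested against.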
