Rationalization of gene regulation by a eukaryotic transcription factor: calculation of regulatory region occupancy from predicted binding affinities.

DNA-binding proteins regulate gene expression by binding preferentially to a set of related sequences. To quantify the correlation between gene regulation and the presence of sequence motifs, the affinity of a transcription factor for each variant of the binding site must be known or predicted. In addition, the contribution of multiple binding sites to the regulation of a single gene must be modeled. To predict the affinity of the yeast Leu3 transcription factor for genomic binding sites, we measured the in vitro equilibrium dissociation constants of 43 binding-site variants and established that the free energy of binding can be approximated as a sum of free energy contributions from each base pair. This allows the prediction of an equilibrium dissociation constant for every potential binding site in the genome and, therefore, of its fractional occupancy at some assumed concentration of free Leu3. From the occupancy of individual sites, the probability that at least one site is occupied within a defined segment upstream of a gene was calculated for all genes in yeast. We find that this probability is substantially better correlated with regulation by Leu3 than is the number of binding sites. This is true whether the number of binding sites is based on a consensus definition of the binding site or on enumeration of all variants that have a predicted Kd value below some threshold. The occupancy calculation was best able to rationalize the Leu3-regulated gene set over a Leu3 concentration range that spans the Kd values for the best sites.
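The calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard single-site binding isotherm, treats sites as binding independently, and uses made-up free energy values; the function names, the `ddg_table` layout, and the value of RT are all hypothetical choices for illustration.

```python
import math

# Assumption: RT at about 25 degrees C, in kcal/mol (illustrative value).
RT_KCAL_PER_MOL = 0.593


def predicted_kd(site, ref_kd, ddg_table):
    """Predict a dissociation constant under an additive free energy model.

    ddg_table[i][base] is the free energy change (kcal/mol), relative to the
    reference site, of having `base` at position i. The table layout is a
    hypothetical convention, not taken from the paper.
    """
    ddg = sum(ddg_table[i][base] for i, base in enumerate(site))
    return ref_kd * math.exp(ddg / RT_KCAL_PER_MOL)


def fractional_occupancy(kd, free_conc):
    """Fraction of time a single site is bound at a given free protein concentration."""
    return free_conc / (free_conc + kd)


def prob_any_site_occupied(kds, free_conc):
    """Probability that at least one site is occupied, assuming independent sites."""
    p_none = 1.0
    for kd in kds:
        p_none *= 1.0 - fractional_occupancy(kd, free_conc)
    return 1.0 - p_none


if __name__ == "__main__":
    # Toy two-position table with invented penalties (kcal/mol).
    toy_table = [
        {"A": 0.0, "C": 1.0, "G": 1.5, "T": 0.5},
        {"A": 0.8, "C": 0.0, "G": 1.2, "T": 2.0},
    ]
    ref_kd = 1.0  # arbitrary concentration units
    kds = [predicted_kd(s, ref_kd, toy_table) for s in ("AC", "TC", "GG")]
    for conc in (0.1, 1.0, 10.0):
        print(conc, prob_any_site_occupied(kds, conc))
```

Because the product runs over every candidate site in the upstream segment, many weak sites can contribute as much as one strong site, which is one way to see why this probability can differ from a simple count of sites above a threshold.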
