Condition specific transcription factor binding site characterization in Saccharomyces cerevisiae

MOTIVATION: We demonstrate a computational process by which transcription factor binding sites can be elucidated using genome-wide expression and binding profiles. The profiles direct us to the intergenic locations likely to contain the promoter regions for a given factor. Multiple local alignment of these sequences yields an anchor motif from which further characterization can proceed.

RESULTS: We present bases for, and assumptions about, the variability within these motifs. These give rise to potentially more accurate motifs, capture complex binding sites built upon the basis motif, and eliminate the constraints of currently employed promoter searching protocols. We also present a measure of motif quality based on the occurrence of the putative motifs in regions observed to contain the binding sites. The assumptions, motif generation, quality assessment and comparison give the user as much control as their a priori knowledge permits.

AVAILABILITY: IGRDB and the datasets mentioned herein are available at http://chipdb.wi.mit.edu/
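
The occurrence-based quality measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function names, the use of an IUPAC consensus string to represent a putative motif, scanning both strands, and scoring by the difference between the fraction of bound and unbound intergenic regions that contain the motif.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table covering the ambiguity codes as well.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC consensus string."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a regex matching the motif on either strand."""
    forward = "".join(IUPAC[base] for base in motif.upper())
    reverse = "".join(IUPAC[base] for base in reverse_complement(motif))
    return re.compile(f"{forward}|{reverse}")


def fraction_with_motif(pattern, sequences):
    """Fraction of sequences containing at least one match."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if pattern.search(seq.upper()))
    return hits / len(sequences)


def occurrence_score(motif, bound_regions, unbound_regions):
    """Hypothetical quality score: occurrence in bound regions minus
    occurrence in unbound regions. This is an assumed scoring rule,
    not the measure defined in the paper."""
    pattern = motif_regex(motif)
    return (fraction_with_motif(pattern, bound_regions)
            - fraction_with_motif(pattern, unbound_regions))


if __name__ == "__main__":
    # Toy sequences, not real yeast intergenic regions.
    bound = ["TTCGGATT", "AATCCGAA", "AAAAAA"]
    unbound = ["AAAAAA", "TTTTTT"]
    print(occurrence_score("CGGA", bound, unbound))
```

Under this assumed rule, a motif that appears in most regions bound by the factor and rarely elsewhere scores close to 1, while a motif that is equally common in both sets scores close to 0.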
