Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities

Transcription factors (TFs) interact with specific DNA regulatory sequences to control gene expression throughout myriad cellular processes. However, the DNA binding specificities of only a small fraction of TFs are sufficiently characterized to predict the sequences that they can and cannot bind. We present a maximally compact, synthetic DNA sequence design for protein binding microarray (PBM) experiments that represents all possible DNA sequence variants of a given length k (that is, all 'k-mers') on a single, universal microarray. We constructed such all-k-mer microarrays covering all 10–base pair (bp) binding sites by converting high-density single-stranded oligonucleotide arrays to double-stranded (ds) DNA arrays. Using these microarrays, we comprehensively determined the binding specificities over a full range of affinities for five TFs of different structural classes from yeast, worm, mouse and human. The unbiased coverage of all k-mers permits high-throughput interrogation of binding site preferences, including nucleotide interdependencies, at unprecedented resolution.
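
To illustrate what a compact representation of all k-mers can look like, the sketch below builds a de Bruijn sequence over the alphabet A, C, G, T. The abstract does not name the construction behind its design, so using a de Bruijn sequence here is an assumption; it is one standard way to place every k-mer exactly once in a cyclic sequence of length 4^k. The function names (`de_bruijn`, `linearize`, `covered_kmers`) are hypothetical, and the demonstration uses k = 3 rather than 10 to keep it small.

```python
from itertools import product


def de_bruijn(alphabet: str, k: int) -> str:
    """Return a cyclic de Bruijn sequence of order k over `alphabet`.

    Uses the standard Lyndon-word concatenation algorithm. This is an
    illustrative construction, not necessarily the one used for the arrays.
    """
    n = len(alphabet)
    a = [0] * (n * k)
    seq = []

    def db(t: int, p: int) -> None:
        if t > k:
            # Emit the Lyndon word only if its length divides k.
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def linearize(cyclic: str, k: int) -> str:
    """Append the first k-1 bases so wrap-around k-mers appear linearly."""
    return cyclic + cyclic[:k - 1]


def covered_kmers(sequence: str, k: int) -> set:
    """Collect every length-k window of a linear sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


if __name__ == "__main__":
    k = 3
    linear = linearize(de_bruijn("ACGT", k), k)
    every_kmer = {"".join(p) for p in product("ACGT", repeat=k)}
    print(len(linear), covered_kmers(linear, k) == every_kmer)
```

For k = 10 the same construction would yield a cyclic sequence of 4^10 = 1,048,576 bases. How such a sequence would be split across individual array probes is not described in the abstract and is left out of this sketch.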
