Shapely DNA attracts the right partner

All levels of cell activity are coordinated directly or indirectly by transcription factors (TFs). In turn, the functioning of TFs relies on their ability to recognize and bind specific DNA sequences to regulate the expression of specific genes. How exactly this specificity is achieved is still not fully understood. TFs are often supposed to make a binary decision to bind or not bind at a target sequence, in much the same way that a restriction enzyme either cuts or does not cut at its restriction site. In reality, most TFs do not bind to a unique sequence but bind with various affinities to a range of related sequences. This molecular recognition is achieved through complementary interactions between protein and DNA surfaces and their functional groups. These interactions must provide enough information both to define the binding site sequence and to discriminate authentic binding sites from a cloud of related sites that might be made accessible by thermal fluctuations (1).

To capture these interactions, genome-wide prediction of TF-binding sites and their affinities (or, ideally, binding free energies) relies chiefly on quantitative models based on experimentally determined or computationally predicted binding sites. Many of these models are mechanistically agnostic, simply exploiting the statistical enrichment of sequences recovered from in vivo or in vitro binding experiments irrespective of the detailed chemistry and physics of site recognition. These quantitative models of TF binding can also be used for predicting disease-causing mutations. The simplest model of TF binding assumes that the preference for any nucleotide within a DNA binding site is independent of the nucleotides in the remaining positions. Such independent position models are typically represented by position weight matrices (PWMs), which report, for each nucleotide at every position, this nucleotide's contribution to the total TF binding affinity score (2). Although such models have been very successful, they are known to be imperfect. In PNAS, Zhou et al. (3) show that information about DNA shape can improve TF-binding models significantly.
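The independent-position assumption behind a PWM can be illustrated with a minimal Python sketch. It is not the method of any cited study: the aligned example sites, the pseudocount, the uniform background frequency, and the function names (build_pwm, score, best_site) are all hypothetical choices made for illustration. Each column of the matrix holds log-odds weights, and the score of a candidate site is simply the sum of one weight per position.

```python
import math

BASES = "ACGT"

# Hypothetical aligned binding sites, invented purely for illustration.
EXAMPLE_SITES = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTAA"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from equal-length sites.

    Assumes a uniform background frequency and a simple additive pseudocount.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score(pwm, site):
    """Score a site as the sum of independent per-position contributions."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_site(pwm, sequence):
    """Slide the PWM along a sequence and return (best score, start index)."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    pwm = build_pwm(EXAMPLE_SITES)
    print(score(pwm, "TGACTCA"), score(pwm, "AAAAAAA"))
    print(best_site(pwm, "CCCCTGACTCAGGG"))
```

Because the score is a plain sum, changing the nucleotide at one position shifts the total by the same amount regardless of the neighboring nucleotides. That is the simplification that shape-aware models aim to relax.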

[1]  High resolution models of transcription factor-DNA affinities improve in vitro and in vivo binding predictions, 2010.

[2]  A. Philippakis,et al.  Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities , 2006, Nature Biotechnology.

[3]  L. Gold,et al.  Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase, 1990, Science.

[4]  Lin Yang,et al.  DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale , 2013, Nucleic Acids Res.

[5]  William Stafford Noble,et al.  High Resolution Models of Transcription Factor-DNA Affinities Improve In Vitro and In Vivo Binding Predictions , 2010, PLoS Comput. Biol.

[6]  Juan M. Vaquerizas,et al.  DNA-Binding Specificities of Human Transcription Factors , 2013, Cell.

[7]  David Levens,et al.  The dynamic response of upstream DNA to transcription-generated torsional stress , 2004, Nature Structural & Molecular Biology.

[8]  Benjamin L. Oakes,et al.  A systematic survey of the Cys2His2 zinc finger DNA-binding landscape , 2015, Nucleic acids research.

[9]  Anirvan M. Sengupta,et al.  A biophysical approach to transcription factor binding site discovery, 2003, Genome research.

[10]  Yue Zhao,et al.  Inferring Binding Energies from Selected Binding Sites , 2009, PLoS Comput. Biol.

[11]  J. Szostak,et al.  In vitro selection of RNA molecules that bind specific ligands , 1990, Nature.

[12]  T. D. Schneider,et al.  70% efficiency of bistate molecular machines explained by information theory, high dimensional geometry and evolutionary convergence , 2010, Nucleic acids research.

[13]  R. Mann,et al.  The role of DNA shape in protein-DNA recognition , 2009, Nature.

[14]  R. Mann,et al.  Quantitative modeling of transcription factor binding specificities using DNA shape , 2015, Proceedings of the National Academy of Sciences.

[15]  Gary D. Stormo,et al.  Modeling the specificity of protein-DNA interactions , 2013, Quantitative Biology.

[16]  T. D. Schneider,et al.  Characterization of Translational Initiation Sites in E. coli, 1982.

[17]  G. Stormo,et al.  Combining SELEX with quantitative assays to rapidly obtain accurate models of protein–DNA interactions , 2005, Nucleic acids research.

[18]  Atina G. Coté,et al.  Evaluation of methods for modeling transcription factor sequence specificity , 2013, Nature Biotechnology.