TFBSshape: a motif database for DNA shape features of transcription factor binding sites

Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive collections of nucleotide sequences that were identified in binding experiments through their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.
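As a rough illustration of what calculating a structural feature from aligned binding-site sequences can look like, the following minimal Python sketch slides a five-nucleotide window along each site, looks up a value for the central position, and averages the values per position across sites. It is not the TFBSshape implementation. The function names are hypothetical, the pentamer window is an assumption, and the lookup table is a toy stand-in whose numbers are placeholders rather than real minor groove width data.

```python
from itertools import product
from statistics import mean


def toy_pentamer_table():
    """Hypothetical stand-in for a real pentamer-to-shape lookup table.

    Each of the 1024 pentamers gets a made-up 'minor groove width' that
    decreases with A/T content. These numbers are placeholders only,
    not measured or simulated values.
    """
    table = {}
    for letters in product("ACGT", repeat=5):
        pentamer = "".join(letters)
        at_count = sum(base in "AT" for base in pentamer)
        table[pentamer] = 6.0 - 0.4 * at_count
    return table


def shape_profile(sequence, table, k=5):
    """Return one value per position of `sequence`.

    Positions too close to either end to have a full k-mer window
    are reported as None.
    """
    flank = k // 2
    profile = [None] * len(sequence)
    for i in range(flank, len(sequence) - flank):
        profile[i] = table[sequence[i - flank:i + flank + 1]]
    return profile


def mean_profile(aligned_sites, table):
    """Average the per-position values over aligned, equal-length sites."""
    lengths = {len(site) for site in aligned_sites}
    if len(lengths) != 1:
        raise ValueError("binding sites must be aligned and of equal length")
    profiles = [shape_profile(site.upper(), table) for site in aligned_sites]
    averaged = []
    for column in zip(*profiles):
        # All sites share one length, so a column is either all None or all numbers.
        averaged.append(None if column[0] is None else mean(column))
    return averaged


if __name__ == "__main__":
    table = toy_pentamer_table()
    sites = ["CACGTGA", "CACGTGC", "CATATGA"]  # illustrative sequences only
    print(mean_profile(sites, table))
```

The per-site profiles returned by `shape_profile` correspond to the rows of a heat map (one row per sequence, one column per position), while `mean_profile` gives a position-wise summary that could be compared between two related TFs.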
