Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites

Motivation: Histone acetylation (HAc) is associated with open chromatin, and HAc has been shown to facilitate transcription factor (TF) binding in mammalian cells. In the innate immune system context, epigenetic studies strongly implicate HAc in the transcriptional response of activated macrophages. We hypothesized that using data from large-scale sequencing of a HAc chromatin immunoprecipitation assay (ChIP-Seq) would improve the performance of computational prediction of binding locations of TFs mediating the response to a signaling event, namely, macrophage activation.

Results: We tested this hypothesis using a multi-evidence approach for predicting binding sites. As a training/test dataset, we used ChIP-Seq-derived TF binding site locations for five TFs in activated murine macrophages. Our model combined TF binding site motif scanning with evidence from sequence-based sources and from HAc ChIP-Seq data, using a weighted sum of thresholded scores. We find that using HAc data significantly improves the performance of motif-based TF binding site prediction. Furthermore, we find that within regions of high HAc, local minima of the HAc ChIP-Seq signal are particularly strongly correlated with TF binding locations. Our model, using motif scanning and HAc local minima, improves the sensitivity for TF binding site prediction by ∼50% over a model based on motif scanning alone, at a false positive rate cutoff of 0.01.

Availability: The data and software source code for model training and validation are freely available online at http://magnet.systemsbiology.net/hac.

Contact: aderem@systemsbiology.org; ishmulevich@systemsbiology.org

Supplementary information: Supplementary data are available at Bioinformatics online.
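As an illustration of the three ideas named in the Results (a weighted sum of thresholded scores, local minima of the HAc signal within high-HAc regions, and sensitivity at a false positive rate cutoff), the following is a minimal Python sketch. It is not the authors' released software. All function names, the indicator-style thresholding, and the rule that a minimum counts only when both flanking values are at or above a cutoff are assumptions made for this example.

```python
from typing import List, Sequence


def local_minima_in_high_regions(signal: Sequence[float], high_cutoff: float) -> List[int]:
    """Return indices of strict local minima whose two neighbours are both
    at or above high_cutoff (assumed definition of a 'high HAc' region)."""
    minima = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if mid < left and mid < right and left >= high_cutoff and right >= high_cutoff:
            minima.append(i)
    return minima


def combined_score(scores: Sequence[float],
                   thresholds: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Weighted sum of thresholded evidence: a source adds its weight only
    when its score meets its threshold (assumed indicator form)."""
    return sum(w for s, t, w in zip(scores, thresholds, weights) if s >= t)


def sensitivity_at_fpr(pos_scores: Sequence[float],
                       neg_scores: Sequence[float],
                       max_fpr: float = 0.01) -> float:
    """Best sensitivity over all score cutoffs whose false positive rate
    does not exceed max_fpr."""
    best = 0.0
    for cutoff in sorted(set(pos_scores) | set(neg_scores)):
        fpr = sum(s >= cutoff for s in neg_scores) / len(neg_scores)
        if fpr <= max_fpr:
            tpr = sum(s >= cutoff for s in pos_scores) / len(pos_scores)
            best = max(best, tpr)
    return best
```

In such a setup, one evidence source could be a motif-scan score and another the distance from a candidate site to the nearest HAc local minimum, converted to a score. Weights and thresholds would then be tuned on the training data, and models would be compared by their sensitivity at the 0.01 false positive rate cutoff.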
