ChEA2: Gene-Set Libraries from ChIP-X Experiments to Decode the Transcription Regulome

ChIP-seq experiments provide a plethora of data regarding transcription regulation in mammalian cells. Integrating ChIP-seq studies into a computable resource is potentially useful for further knowledge extraction from such data. We continually collect and expand a database in which we convert results from ChIP-seq experiments into gene-set libraries. The manually compiled portion of this database currently contains 200 transcription factors from 221 publications, for a total of 458,471 transcription-factor/target interactions. In addition, we automatically compiled data from the ENCODE project, which includes 920 experiments applied to 44 cell lines profiling 160 transcription factors, for a total of ~1.4 million transcription-factor/target-gene interactions. Moreover, we processed data from the NIH Epigenomics Roadmap project for 27 different types of histone marks in 64 different human cell lines. Altogether, the data were processed into three simple gene-set libraries in which each set label is either a mammalian transcription factor or a histone modification mark in a particular cell line, organism and experiment. Such gene-set libraries are useful for elucidating, through gene-set enrichment analyses, the experimentally determined transcriptional networks that regulate lists of genes of interest. Furthermore, from these three gene-set libraries, we constructed regulatory networks of transcription factors and histone modifications to identify groups of regulators that work together. For example, we found that the Polycomb Repressive Complex 2 (PRC2) is involved in three distinct clusters, each interacting with a different set of transcription factors. The combined dataset is delivered as a web-based application in which users can perform enrichment analyses or download the data in various formats. The open-source ChEA2 web-based software and datasets are freely available online at http://amp.pharm.mssm.edu/ChEA2.
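The conversion of regulator/target interactions into a gene-set library, and the enrichment analysis of a gene list against that library, can be sketched as follows. This is a minimal illustration and not the ChEA2 implementation: the abstract does not state which statistical test is used, so a one-sided hypergeometric test is assumed here, and the function names, set labels, gene symbols and background size are all hypothetical.

```python
from collections import defaultdict
from math import comb


def build_library(interactions):
    """Group (set_label, target_gene) pairs into {set_label: set of genes}.

    Gene symbols are upper-cased so that matching is case-insensitive
    (a simplification chosen for this sketch).
    """
    library = defaultdict(set)
    for label, gene in interactions:
        library[label].add(gene.upper())
    return dict(library)


def hypergeom_pvalue(overlap, set_size, query_size, background_size):
    """Probability of seeing at least `overlap` set members when
    `query_size` genes are drawn at random from the background."""
    upper = min(set_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(query_genes, library, background_size):
    """Rank the gene sets that overlap the query, smallest p-value first."""
    query = {g.upper() for g in query_genes}
    results = []
    for label, targets in library.items():
        shared = query & targets
        if not shared:
            continue  # sets with no overlap are not reported
        p = hypergeom_pvalue(len(shared), len(targets), len(query), background_size)
        results.append((label, len(shared), p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy data; labels combine regulator, cell line and organism.
toy_interactions = [
    ("TF_A|cellX|human", "gene1"),
    ("TF_A|cellX|human", "gene2"),
    ("MARK_B|cellY|human", "gene3"),
    ("MARK_B|cellY|human", "gene4"),
]
toy_library = build_library(toy_interactions)
toy_hits = enrich(["gene1", "gene2"], toy_library, background_size=4)
```

In practice the background size would be the number of genes that could have been detected as targets, and the p-values would need correction for testing many gene sets at once; both choices are left open in this sketch.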
