Target analysis by integration of transcriptome and ChIP-seq data with BETA

The combination of ChIP-seq and transcriptome analysis is a compelling approach to unravel the regulation of gene expression. Several recently published methods combine transcription factor (TF) binding and gene expression for target prediction, but few of them provide an efficient software package for the community. Binding and expression target analysis (BETA) is a software package that integrates ChIP-seq of TFs or chromatin regulators with differential gene expression data to infer direct target genes. BETA has three functions: (i) to predict whether the factor has an activating or repressive function; (ii) to infer the factor's target genes; and (iii) to identify the motif of the factor and of its collaborators, which might modulate the factor's activating or repressive function. Here we describe the implementation and features of BETA and demonstrate its application to several data sets. BETA requires ∼1 GB of RAM, and the procedure takes 20 min to complete. BETA is available as open source at http://cistrome.org/BETA/.
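To make the idea of integrating binding with differential expression concrete, the following is a minimal Python sketch and not BETA's actual implementation. It assumes a distance-decayed binding score per gene and a rank product that combines binding rank with expression rank. The decay formula, the 100 kb window, the gene names, the coordinates, and all function names are assumptions chosen for illustration; none of them is stated in the abstract above.

```python
import math


def regulatory_potential(tss, peak_centers, max_distance=100_000):
    """Sum distance-decayed contributions of peaks near one TSS.

    Assumed weighting: exp(-(0.5 + 4 * d / max_distance)); peaks farther
    than max_distance from the TSS contribute nothing.
    """
    score = 0.0
    for center in peak_centers:
        distance = abs(center - tss)
        if distance <= max_distance:
            score += math.exp(-(0.5 + 4.0 * distance / max_distance))
    return score


def rank_descending(values):
    """Return 1-based ranks; the largest value receives rank 1."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {key: position + 1 for position, key in enumerate(ordered)}


def rank_product(binding_scores, expression_scores):
    """Combine binding and expression ranks; smaller means stronger candidate."""
    genes = sorted(set(binding_scores) & set(expression_scores))
    n = len(genes)
    binding_rank = rank_descending({g: binding_scores[g] for g in genes})
    # Rank by magnitude so that up- and down-regulated genes both count.
    expression_rank = rank_descending({g: abs(expression_scores[g]) for g in genes})
    return {g: (binding_rank[g] / n) * (expression_rank[g] / n) for g in genes}


# Hypothetical toy data: TSS positions, peak centers, and expression statistics.
tss_positions = {"GENE_A": 1_000, "GENE_B": 500_000, "GENE_C": 900_000}
peak_centers = [1_200, 3_000, 480_000]
expression_stats = {"GENE_A": 2.5, "GENE_B": -0.4, "GENE_C": 1.0}

binding = {g: regulatory_potential(t, peak_centers) for g, t in tss_positions.items()}
combined = rank_product(binding, expression_stats)
best_candidate = min(combined, key=combined.get)
```

In this toy example GENE_A has two nearby peaks and the largest expression change, so it receives the smallest rank product. A real analysis would use the tool's own scoring and significance testing rather than this simplified ranking.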
