Statistical Identification of Co-regulatory Gene Modules using Multiple ChIP-Seq Experiments

ChIP-Seq experiments provide accurate measurements of the regulatory roles of transcription factors (TFs) under specific conditions. Downstream target genes can be detected by analyzing the enriched TF binding sites (TFBSs) in the promoter regions of genes. The location and statistical significance of each TFBS make it possible to evaluate the relative importance of each binding event. Assuming that the TFBSs of a single ChIP-Seq experiment follow the same location distribution, we first propose a statistical model that uses both the location and the significance of peaks to weigh target genes. The binding scores that genes receive from different TFs are then merged into a weighted binding matrix, and a Markov Chain Monte Carlo (MCMC) based approach is applied to this matrix to identify co-regulatory modules. We demonstrate the efficiency of our statistical model on an ER-α ChIP-Seq dataset and further identify co-regulatory modules using eleven breast cancer related TFs from ENCODE ChIP-Seq datasets. The results show that the TFs in each module regulate common high-scoring target genes, that the associations between TFs are biologically meaningful, and that the functional roles of the TFs and their target genes are consistent.
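
As a rough illustration of the weighting and merging steps described above, the following minimal sketch scores each peak by its significance and its distance to the transcription start site (TSS), then sums the scores per gene and per TF into a binding matrix. The abstract does not give the form of the location distribution, so the exponential decay, the `decay_bp` parameter, the function names, and the toy peaks are all assumptions made for illustration, not the model of the paper. The MCMC module search is not shown.

```python
import math
from collections import defaultdict


def peak_weight(distance_to_tss, p_value, decay_bp=1000.0):
    """Hypothetical peak weight: -log10(p) damped by distance to the TSS.

    The exponential decay is an assumed stand-in for the location
    distribution that the paper fits per ChIP-Seq experiment.
    """
    significance = -math.log10(p_value)
    location = math.exp(-abs(distance_to_tss) / decay_bp)
    return significance * location


def binding_matrix(peaks_by_tf):
    """Merge per-TF peak weights into a genes x TFs weighted matrix.

    peaks_by_tf maps a TF name to a list of
    (gene, distance_to_tss, p_value) tuples.
    """
    tfs = sorted(peaks_by_tf)
    scores = defaultdict(lambda: [0.0] * len(tfs))
    for col, tf in enumerate(tfs):
        for gene, distance, p_value in peaks_by_tf[tf]:
            # A gene with several peaks from one TF accumulates their weights.
            scores[gene][col] += peak_weight(distance, p_value)
    genes = sorted(scores)
    return genes, tfs, [scores[g] for g in genes]


# Toy input with made-up values, purely for illustration.
toy_peaks = {
    "TF1": [("geneA", -200, 1e-8), ("geneB", 500, 1e-4)],
    "TF2": [("geneA", 1500, 1e-6)],
}
genes, tfs, matrix = binding_matrix(toy_peaks)
for gene, row in zip(genes, matrix):
    print(gene, [round(value, 3) for value in row])
```

Rows of the resulting matrix are genes and columns are TFs; a zero entry means that no peak of that TF was assigned to the gene. A module search would then look for sets of TFs whose columns share high-scoring rows.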
