Integrative Analysis of Deep Sequencing Data Identifies Estrogen Receptor Early Response Genes and Links ATAD3B to Poor Survival in Breast Cancer

Identifying genes that respond to an extracellular cue enables characterization of pathophysiologically crucial biological processes. Deep sequencing technologies provide a powerful means to identify responsive genes, which creates a need for computational methods that can analyze dynamic, multi-level deep sequencing data. To answer this need, we introduce a data-driven algorithm, SPINLONG, designed to search for genes that match user-defined hypotheses or models. SPINLONG is applicable to various experimental setups that measure several molecular markers in parallel. To demonstrate the SPINLONG approach, we analyzed ChIP-seq data reporting PolII, estrogen receptor (ER), H3K4me3 and H2A.Z occupancy at five time points in the MCF-7 breast cancer cell line after estradiol stimulus. We obtained 777 early responsive genes and compared the biological functions of the genes having ER binding within 20 kb of the transcription start site (TSS) to those of genes without such a binding site. Our results suggest that the non-genomic action of ER via the MAPK pathway, instead of direct ER binding, may be responsible for early cell responses to ER activation. Our results also indicate that the responsive genes triggered by the genomic pathway are transcribed faster than those without ER binding sites. Survival analysis of the 777 responsive genes in 150 primary breast cancer tumors and in two independent validation cohorts indicated that the ATAD3B gene, which does not have an ER binding site within 20 kb of its TSS, is significantly associated with poor patient survival.
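The comparison of genes with and without a binding site within 20 kb of the TSS can be illustrated with a minimal sketch. This is not the SPINLONG implementation; it only shows one plausible way to partition a gene list by proximity to binding peaks. The function names, the input layout (one TSS coordinate per gene, one summit position per peak) and the toy gene names and coordinates are all assumptions made for illustration.

```python
from bisect import bisect_left

WINDOW_BP = 20_000  # the 20 kb distance stated in the abstract


def has_binding_near_tss(tss, sorted_peaks, window=WINDOW_BP):
    """Return True if any peak position lies within `window` bp of `tss`.

    `sorted_peaks` must be sorted ascending and come from the gene's chromosome.
    """
    # First peak at or to the right of the left edge of the window.
    i = bisect_left(sorted_peaks, tss - window)
    return i < len(sorted_peaks) and sorted_peaks[i] <= tss + window


def split_genes_by_binding(genes, peaks_by_chrom, window=WINDOW_BP):
    """Partition genes into (with_binding, without_binding).

    genes: dict mapping gene name -> (chromosome, TSS position)  [assumed layout]
    peaks_by_chrom: dict mapping chromosome -> list of peak positions
    """
    sorted_peaks = {chrom: sorted(pos) for chrom, pos in peaks_by_chrom.items()}
    with_binding, without_binding = [], []
    for name, (chrom, tss) in genes.items():
        if has_binding_near_tss(tss, sorted_peaks.get(chrom, []), window):
            with_binding.append(name)
        else:
            without_binding.append(name)
    return with_binding, without_binding


# Toy input with hypothetical gene names and coordinates (not real data).
toy_genes = {
    "GENE_A": ("chr1", 100_000),
    "GENE_B": ("chr1", 500_000),
    "GENE_C": ("chr2", 50_000),
}
toy_peaks = {"chr1": [300_000, 115_000]}

near, far = split_genes_by_binding(toy_genes, toy_peaks)
print(near)  # ['GENE_A']
print(far)   # ['GENE_B', 'GENE_C']
```

The window is treated as symmetric around the TSS and inclusive at both ends; whether the original analysis used the same convention is not stated in the abstract.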
