Incorporating the Rate of Transcriptional Change Improves Construction of Gene Regulatory Networks

Transcriptional regulatory networks (TRNs) can be developed by computational approaches that infer regulator-target gene interactions from transcriptional assays. Algorithms that generate predictive, accurate TRNs enable the identification of regulator-target relationships in conditions where experimentally determining regulatory interactions is a challenge. Improving the ability of TRNs to predict known regulator-target relationships in model species will increase confidence in applying these approaches to non-model species, where experimental validation is challenging. Many transcriptional profiling experiments are performed across multiple time points; therefore, we sought to improve regulator-target predictions by adjusting how time is incorporated into the network. We created ExRANGES, which incorporates Expression in a Rate-Normalized GEne Specific manner and thereby adjusts how expression data is provided to the network algorithm. We tested this on two different network construction approaches and found that ExRANGES prioritizes targets differently than traditional expression and improves the ability of these networks to accurately predict known regulator targets. ExRANGES improved the ability to correctly identify targets of transcription factors in large data sets in four different model systems: mouse, human, Arabidopsis, and yeast. Finally, we examined the performance of ExRANGES on a small data set from field-grown Oryza sativa and found that it also improved the ability to identify known targets even with a limited data set.

Author Summary

In model organisms, the ability to identify direct targets of transcription factors (TFs) via high-throughput experimental assays has advanced our understanding of transcriptional regulatory networks and how organisms regulate gene expression. However, for non-model organisms, it remains a challenge to identify TF-target relationships through experimental approaches such as ChIP-Seq, which limits the ability to understand regulatory control. Computational approaches that identify regulator-target relationships in silico from easily attainable transcriptional data offer a solution. Most algorithms for identifying gene regulatory networks from time series data weigh the relationship between regulators and putative targets at all time points equally. However, many regulators may control a single target in response to different inputs. Our approach, ExRANGES, focuses on time points where there is a significant change in expression to identify the association between regulators and targets. ExRANGES essentially weights the expression value of each time point by the slope change after that time point, thereby emphasizing the relationship between regulators and targets at the time points when the transcript levels are changing. We show that this change to the way expression data is incorporated into gene regulatory network algorithms improves the identification of regulator-target interactions, and we hope this will improve in silico identification of regulatory relationships in many species.
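
The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rate_weighted_expression`, the use of a per-gene standard deviation as the gene-specific normalization, and the choice to drop the final time point (which has no following slope) are all assumptions made for illustration only.

```python
import numpy as np


def rate_weighted_expression(expr, times):
    """Weight each expression value by the slope that follows it.

    expr  : array of shape (genes, time points)
    times : strictly increasing time stamps, one per column of expr

    Hypothetical helper: the normalization below is an assumption,
    not the procedure used in the ExRANGES publication.
    """
    expr = np.asarray(expr, dtype=float)
    times = np.asarray(times, dtype=float)

    # Slope from each time point to the next one, per gene.
    slopes = np.diff(expr, axis=1) / np.diff(times)

    # Gene-specific scaling (assumption): divide each gene's slopes by the
    # standard deviation of that gene's own slopes across the series.
    scale = slopes.std(axis=1, keepdims=True)
    scale[scale == 0] = 1.0  # flat genes: avoid division by zero
    norm_slopes = slopes / scale

    # Expression at time t weighted by the normalized slope after t.
    # The last time point has no following slope and is dropped.
    return expr[:, :-1] * norm_slopes


# Toy input: one changing gene and one flat gene over four time points.
toy_expr = [[1.0, 2.0, 4.0, 4.0],
            [5.0, 5.0, 5.0, 5.0]]
toy_times = [0.0, 1.0, 2.0, 3.0]
weighted = rate_weighted_expression(toy_expr, toy_times)
```

In this sketch, time points followed by a large change receive large-magnitude values and time points followed by no change contribute zero. The resulting matrix could then be passed to a network inference algorithm in place of the raw expression matrix.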
