BTNET : boosted tree based gene regulatory network inference algorithm using time-course measurement data

Background: Identifying gene regulatory networks (GRNs) is an important task for understanding biological systems. Time-course measurement data have become a valuable resource for inferring GRNs, and various methods have been presented for reconstructing the networks from such data. However, existing methods have been validated on only a limited number of benchmark datasets and have rarely been verified on real biological systems.

Results: We first integrated benchmark time-course gene expression datasets from previous studies and reassessed the baseline methods. We observed that GENIE3-time, a tree-based ensemble method, achieved the best performance among the baselines. In this study, we introduce BTNET, a boosted tree based gene regulatory network inference algorithm that improves on the state of the art. We quantitatively validated BTNET on the integrated benchmark dataset, where its AUROC and AUPR scores were higher than those of the baselines. We also qualitatively validated the results of BTNET through an experiment on neuroblastoma cells treated with an antidepressant. The regulatory network inferred by BTNET showed that brachyury, a transcription factor, was regulated by fluoxetine, an antidepressant, which was verified by the expression of its downstream genes.

Conclusions: We present BTNET, which infers a GRN from time-course measurement data using boosting algorithms. Our model achieved the highest AUROC and AUPR scores on the integrated benchmark dataset. We further validated BTNET qualitatively through a wet-lab experiment and showed that BTNET can produce biologically meaningful results.
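To make the general idea of boosted-tree network inference from time-course data concrete, the following is a minimal sketch and not the BTNET implementation. It assumes a common time-lagged formulation: each target gene at time t+1 is regressed on the other genes at time t, and the importance of each regulator becomes the edge score. The weak learner (a depth-one regression stump), the squared-error loss, the importance measure (summed error reduction), and all function names and parameter values are assumptions made for illustration only.

```python
import random

def fit_stump(X, r):
    """Best single-feature threshold split of residuals r (squared-error gain)."""
    n = len(r)
    total = sum(r)
    parent_term = total * total / n
    best = (0.0, None, 0.0, 0.0, 0.0)  # gain, feature, threshold, left mean, right mean
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - parent_term
            if gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def boosted_importances(X, y, n_rounds=50, lr=0.1):
    """Gradient boosting with stumps; returns summed split gain per feature."""
    n = len(y)
    pred = [sum(y) / n] * n
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [y[i] - pred[i] for i in range(n)]
        gain, j, thr, left_mean, right_mean = fit_stump(X, residuals)
        if j is None:
            break  # no split improves the fit any further
        importance[j] += gain
        for i in range(n):
            pred[i] += lr * (left_mean if X[i][j] <= thr else right_mean)
    return importance

def infer_network(series, n_rounds=50, lr=0.1):
    """series[t][g] is the expression of gene g at time point t.

    Returns {(regulator, target): score}, with scores normalised per target.
    """
    n_genes = len(series[0])
    scores = {}
    for target in range(n_genes):
        regulators = [g for g in range(n_genes) if g != target]
        X = [[row[g] for g in regulators] for row in series[:-1]]  # time t
        y = [row[target] for row in series[1:]]                    # time t+1
        importance = boosted_importances(X, y, n_rounds, lr)
        total = sum(importance) or 1.0
        for pos, reg in enumerate(regulators):
            scores[(reg, target)] = importance[pos] / total
    return scores

# Toy data (hypothetical): gene 1 copies gene 0 with a one-step delay,
# gene 2 is unrelated noise.
random.seed(0)
T = 40
driver = [random.random() for _ in range(T)]
noise = [random.random() for _ in range(T)]
follower = [0.0] + driver[:-1]
series = [[driver[t], follower[t], noise[t]] for t in range(T)]
scores = infer_network(series)
ranked_edges = sorted(scores, key=scores.get, reverse=True)
```

Sorting the scored edges, as in `ranked_edges`, gives the kind of ranked edge list that can be compared against a gold-standard network to compute AUROC and AUPR.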
