Comparative analysis of differential gene expression tools for RNA sequencing time course data

Abstract: RNA sequencing (RNA-seq) has become a standard procedure for investigating transcriptional changes between conditions and is routinely used in research and clinics. While standard differential expression (DE) analysis between two conditions has been extensively studied and improved over the past decades, RNA-seq time course (TC) DE analysis algorithms are still in their early stages. In this study, we compare, for the first time, existing TC RNA-seq tools on an extensive simulated data set and validate the best-performing tools on published data. Surprisingly, on short time series (<8 time points), the TC tools, with the exception of ImpulseDE2, were outperformed by the classical pairwise comparison approach in terms of overall performance and robustness to noise, mostly because of a high number of false positives. Overlapping the candidate lists of several tools mitigated this shortcoming, as the majority of false-positive, but not true-positive, candidates were unique to each method. On longer time series, the pairwise approach performed worse overall than splineTC and maSigPro, which did not identify any false-positive candidates.
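
The overlap strategy mentioned above can be illustrated with a minimal sketch. It assumes each tool returns a plain list of candidate gene identifiers; the tool names, gene names, and the `min_tools` threshold are hypothetical and not taken from the study, which may have combined lists differently.

```python
from collections import Counter


def consensus_candidates(candidate_lists, min_tools=2):
    """Return the genes reported by at least `min_tools` of the given tools.

    `candidate_lists` maps a tool name to an iterable of gene identifiers.
    """
    counts = Counter()
    for genes in candidate_lists.values():
        # Use a set so a gene listed twice by one tool is counted once.
        counts.update(set(genes))
    return {gene for gene, n in counts.items() if n >= min_tools}


# Hypothetical candidate lists: g1-g3 stand in for shared (likely true) calls,
# fp_* for tool-specific (likely false-positive) calls.
calls = {
    "tool_a": ["g1", "g2", "g3", "fp_a"],
    "tool_b": ["g1", "g2", "fp_b"],
    "tool_c": ["g1", "g3", "fp_c"],
}

consensus = consensus_candidates(calls, min_tools=2)
print(sorted(consensus))
```

Raising `min_tools` makes the consensus stricter: candidates unique to one tool are dropped first, which is where the abstract reports most false positives to lie, at the cost of possibly losing true positives that only some tools detect.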
