consensusDE: an R package for assessing consensus of multiple RNA-seq algorithms with RUV correction

Extensive evaluations of RNA-seq methods have demonstrated that no single algorithm consistently outperforms all others. Removal of unwanted variation (RUV) has also been proposed as a method for stabilizing differential expression (DE) results. Despite this, it remains a challenge to run multiple RNA-seq algorithms and identify the significant differences common to them, whilst also integrating RUV into each algorithm and assessing its impact. consensusDE was developed to automate the identification of significant DE by combining the results from multiple algorithms with minimal user input, with the option to automatically integrate RUV. consensusDE requires only a table describing the sample groups, a directory containing BAM files or preprocessed count tables, and an optional transcript database for annotation. It supports merging of technical replicates and paired analyses, and outputs a compendium of plots to guide the user in subsequent analyses. Herein, we also assess the ability of RUV to improve DE stability when combined with multiple algorithms, through application to real and simulated data. We find that, although RUV improved the false discovery rate (FDR) in a setting of low replication, the effect was algorithm specific and diminished with increased replication, reinforcing the importance of increased replication for recovery of true DE genes. We finish by offering some rules and considerations for the application of RUV in a consensus-based setting. consensusDE is implemented in R and freely available as a Bioconductor package under the GPL-3 license, along with a comprehensive vignette describing its functionality: http://bioconductor.org/packages/consensusDE/
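The idea of a consensus across algorithms can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the package's implementation (consensusDE itself is an R package). The function name `consensus_de`, the threshold `alpha` and the example values are all hypothetical. The sketch assumes one simple consensus rule: a gene is retained only if every algorithm reports an adjusted p-value below the threshold, and it is then ranked by its largest (most conservative) adjusted p-value.

```python
from typing import Dict


def consensus_de(p_adj: Dict[str, Dict[str, float]],
                 alpha: float = 0.05) -> Dict[str, float]:
    """Return genes called significant by every algorithm.

    p_adj maps an algorithm name to a {gene: adjusted p-value} table.
    This is a hypothetical helper, not part of consensusDE.
    """
    if not p_adj:
        return {}
    algorithms = list(p_adj)
    # Only genes tested by all algorithms can be in the consensus.
    shared = set.intersection(*(set(p_adj[a]) for a in algorithms))
    consensus = {}
    for gene in shared:
        # The largest adjusted p-value is the most conservative summary.
        worst = max(p_adj[a][gene] for a in algorithms)
        if worst < alpha:
            consensus[gene] = worst
    # Rank from strongest to weakest consensus evidence.
    return dict(sorted(consensus.items(), key=lambda kv: kv[1]))


# Made-up adjusted p-values, for illustration only.
example = {
    "edgeR":  {"geneA": 0.001, "geneB": 0.03, "geneC": 0.20},
    "DESeq2": {"geneA": 0.004, "geneB": 0.08, "geneC": 0.01},
    "voom":   {"geneA": 0.002, "geneB": 0.04, "geneC": 0.30},
}
print(consensus_de(example))
```

In this made-up example only `geneA` passes in all three result tables, so it is the only gene in the consensus set. Other rules (for example, requiring agreement from a majority of algorithms) would be equally easy to express.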
