MultiRankSeq: Multiperspective Approach for RNAseq Differential Expression Analysis and Quality Control

Background. After a decade in which microarray technology dominated high-throughput gene expression profiling, the introduction of RNAseq has revolutionized gene expression research. Although RNAseq provides more information than microarrays, its analysis has proved considerably more complicated. To date, no consensus has been reached on the best approach for RNAseq-based differential expression analysis. Not surprisingly, different studies have reached different conclusions about the best way to identify differentially expressed genes, each based on its own criteria and the scenarios it considered. Furthermore, the lack of effective quality control may lead to misinterpretation of results and erroneous conclusions. To address these problems, we propose MultiRankSeq, a simple yet safe and practical rank-sum approach for RNAseq-based differential gene expression analysis. MultiRankSeq first performs a quality control assessment. For data that meet the quality control criteria, MultiRankSeq compares the study groups using several of the most commonly applied analytical methods and combines their results into a new rank-sum interpretation. MultiRankSeq thus offers a multiperspective approach to RNAseq differential expression analysis. It is written in R and is easy to apply: detailed graphical and tabular analysis reports can be generated with a single command.
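The rank-sum idea described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the MultiRankSeq R code. It assumes that each method reports a p-value per gene, that genes are ranked within each method by p-value with ties sharing the average rank, and that the ranks are then summed across methods. The method names (`method_a`, `method_b`, `method_c`), the function names, and the example values are hypothetical; the package's actual ranking statistic may differ.

```python
from typing import Dict, List, Tuple


def rank_genes(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Rank genes by p-value (1 = most significant); ties share the average rank."""
    ordered = sorted(pvalues, key=lambda g: pvalues[g])
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        # Extend j to the end of the run of genes tied with ordered[i].
        j = i
        while j + 1 < len(ordered) and pvalues[ordered[j + 1]] == pvalues[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum(results: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Sum per-method ranks for genes reported by every method.

    Returns (gene, rank_sum) pairs, smallest sum first; a smaller sum
    means the gene ranks consistently high across methods.
    """
    shared = set.intersection(*(set(r) for r in results.values()))
    per_method = {
        method: rank_genes({g: pvals[g] for g in shared})
        for method, pvals in results.items()
    }
    totals = {g: sum(ranks[g] for ranks in per_method.values()) for g in shared}
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


# Hypothetical p-values from three placeholder methods (illustrative only).
example_results = {
    "method_a": {"g1": 0.001, "g2": 0.04, "g3": 0.5},
    "method_b": {"g1": 0.01, "g2": 0.002, "g3": 0.3},
    "method_c": {"g1": 0.0005, "g2": 0.2, "g3": 0.2},
}

combined = rank_sum(example_results)
for gene, total in combined:
    print(gene, total)
```

Summing ranks rather than averaging p-values keeps the combination independent of how each method scales its statistics, which is one reason a rank-based summary is attractive when methods disagree.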
