Differential meta-analysis of RNA-seq data from multiple studies

Background: High-throughput sequencing is now regularly used for studies of the transcriptome (RNA-seq), particularly for comparisons among experimental conditions. For the time being, a limited number of biological replicates are typically considered in such experiments, leading to low detection power for differential expression. As the cost of these experiments continues to decrease, it is likely that additional follow-up studies will be conducted to re-address the same biological question.

Results: We demonstrate how p-value combination techniques previously used for microarray meta-analyses can be used for the differential analysis of RNA-seq data from multiple related studies. These techniques are compared to a negative binomial generalized linear model (GLM) including a fixed study effect, on simulated data and on real data from human melanoma cell lines. The GLM with fixed study effect performed well for low inter-study variation and small numbers of studies, but was outperformed by the meta-analysis methods for moderate to large inter-study variability and larger numbers of studies.

Conclusions: The p-value combination techniques illustrated here are a valuable tool for performing differential meta-analyses of RNA-seq data, as they appropriately account for biological and technical variability within studies as well as additional study-specific effects. An R package, metaRNASeq, is available on CRAN (http://cran.r-project.org/web/packages/metaRNASeq).
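To make the idea of p-value combination concrete, here is a minimal sketch of how per-study p-values for a single gene might be merged into one meta-analysis p-value. The abstract does not say which combination techniques are used, so the two shown here (Fisher's method and a weighted inverse normal combination) are assumptions chosen as classical examples. The function names and the choice of weights are hypothetical and are not taken from the metaRNASeq package. Both functions assume independent studies and one-sided p-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # test statistic divided by 2
    # Closed-form chi-square survival function for even degrees of freedom (2k):
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def inverse_normal_combine(pvalues, weights=None):
    """Weighted inverse normal combination of one-sided p-values."""
    norm = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights (an assumption)
    scale = math.sqrt(sum(w * w for w in weights))
    # Convert each p-value to a z-score, then take a normalized weighted sum.
    z = sum(w * norm.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues)) / scale
    return 1.0 - norm.cdf(z)


# Hypothetical per-study p-values for one gene across three studies
study_pvalues = [0.04, 0.20, 0.01]
print(fisher_combine(study_pvalues))
print(inverse_normal_combine(study_pvalues, weights=[3, 2, 4]))
```

In a genome-wide analysis, a combination like this would be applied gene by gene, and the combined p-values would then be adjusted for multiple testing. In the inverse normal variant, weights could for instance reflect the number of replicates per study, although that choice is illustrative here.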
