Interpretation of 'Omics dynamics in a single subject using local estimates of dispersion between two transcriptomes

Identifying differentially expressed genes (DEGs) from RNA-sequencing data conventionally requires replicates to estimate gene-wise variability, which is often infeasible in clinical settings. Conventional methods (edgeR, NOISeq-sim, DESeq, DEGseq) have been proposed for comparing two conditions without replicates (TCWR), but only by imposing restrictive transcriptome-wide assumptions that limit their inferential opportunities, and they have not been evaluated in this setting. Under TCWR conditions (e.g., unaffected tissue vs. tumor), the proposed individualized DEG (iDEG) method models the differences of transformed expression with a distribution estimated across a local partition of transcripts with comparable baseline expression; the probability that each gene is a DEG is then estimated by empirical Bayes with local false discovery rate control using a two-group mixture model. In extensive simulation studies of TCWR methods, iDEG is more accurate when 5% < DEGs < 20% (precision > 90%, recall > 75%, false positive rate < 1%), and NOISeq is more accurate when 30% < DEGs < 40% (precision = recall ∼ 90%). The proposed iDEG method borrows localized distribution information from within the same individual, a strategy that improves the accuracy of comparing transcriptomes in the absence of replicates when the proportion of DEGs is low. http://www.lussiergroup.org/publications/iDEG
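The pipeline summarized above (transform, local partition by baseline expression, two-group local false discovery rate) can be illustrated with a minimal Python sketch. This is not the authors' iDEG implementation; it is an illustration under stated assumptions. The function names (`simulate_pair`, `local_z_scores`, `local_fdr`), the `log2(count + 1)` transform, the fixed-size windows, the median/MAD standardization, the theoretical N(0, 1) null, the conservative `pi0 = 1`, and the histogram density estimate are all hypothetical stand-ins chosen for brevity.

```python
import numpy as np


def simulate_pair(n_genes=4000, deg_fraction=0.10, fold=4.0,
                  dispersion=0.01, seed=0):
    """Simulate one baseline/case pair of negative binomial counts (toy data)."""
    rng = np.random.default_rng(seed)
    mu = rng.lognormal(mean=5.0, sigma=1.0, size=n_genes)   # baseline means
    is_deg = rng.random(n_genes) < deg_fraction             # true DEG labels
    direction = rng.choice([-1.0, 1.0], size=n_genes)       # up or down
    fc = np.where(is_deg, fold ** direction, 1.0)
    n = 1.0 / dispersion                                    # NB size parameter
    baseline = rng.negative_binomial(n, n / (n + mu))
    case = rng.negative_binomial(n, n / (n + mu * fc))
    return baseline, case, is_deg


def local_z_scores(baseline, case, window=200):
    """Standardize transformed differences within windows of similar baseline."""
    # Assumption: log2(count + 1) stands in for the paper's transformation.
    tx = np.log2(np.asarray(baseline, dtype=float) + 1.0)
    ty = np.log2(np.asarray(case, dtype=float) + 1.0)
    diff = ty - tx
    order = np.argsort(tx, kind="stable")   # sort genes by baseline expression
    z = np.empty_like(diff)
    for start in range(0, diff.size, window):
        idx = order[start:start + window]   # one local partition
        d = diff[idx]
        center = np.median(d)
        scale = 1.4826 * np.median(np.abs(d - center))  # MAD-based robust scale
        if scale <= 0.0:                    # degenerate window: fall back
            scale = d.std() if d.std() > 0.0 else 1.0
        z[idx] = (d - center) / scale
    return z


def local_fdr(z, pi0=1.0, nbins=80, zmax=8.0):
    """Two-group local fdr: pi0 * f0(z) / f(z), capped at 1."""
    zc = np.clip(z, -zmax, zmax)
    edges = np.linspace(-zmax, zmax, nbins + 1)
    dens, _ = np.histogram(zc, bins=edges, density=True)  # crude estimate of f
    which = np.clip(np.digitize(zc, edges) - 1, 0, nbins - 1)
    f = dens[which]                          # positive: each z lies in its own bin
    f0 = np.exp(-0.5 * zc ** 2) / np.sqrt(2.0 * np.pi)    # theoretical null
    return np.minimum(1.0, pi0 * f0 / f)


baseline, case, is_deg = simulate_pair()
fdr = local_fdr(local_z_scores(baseline, case))
called = fdr < 0.2   # illustrative cutoff, not taken from the paper
print("genes called:", int(called.sum()), "true DEGs:", int(is_deg.sum()))
```

Standardizing within windows of similar baseline expression is what lets a single pair of samples supply its own dispersion estimate: genes at comparable expression levels serve as each other's reference, and the robust median/MAD keeps a modest fraction of true DEGs from inflating the local scale. A smoother density estimate and an estimated `pi0` would be natural refinements of this sketch.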
