Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

Abstract: Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells; without proper adjustment, these biases can distort downstream analysis. To account for cell-to-cell technical differences, we propose TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes framework that uses external RNA spike-ins to reliably model cell-specific dropout rates and amplification bias. TASC incorporates these technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and to detect differentially expressed genes. Importantly, TASC can adjust for covariates to further eliminate confounding that may originate from differences in cell size and cell cycle. In simulated and real scRNA-seq data, TASC achieves accurate Type I error control and, compared to existing methods, displays competitive sensitivity and improved robustness to batch effects in differential expression analysis. TASC is implemented to be computationally efficient and takes advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq.
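To make the idea of spike-in-based technical parameters concrete, the following is a minimal sketch of how cell-specific amplification and dropout parameters might be estimated for a single cell. It is not the TASC implementation. The functional forms (a log-linear amplification model and a logistic detection model), the function name `fit_cell_technical_params`, the parameter names `alpha`, `beta`, `kappa`, `tau`, and all simulated values are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def fit_cell_technical_params(log_true, counts, n_iter=50, ridge=1e-3):
    """Estimate illustrative per-cell technical parameters from spike-ins.

    Assumed model (hypothetical, for illustration only):
      detected spike-ins: log(count) = alpha + beta * log_true + noise
      detection:          P(count > 0) = sigmoid(kappa + tau * log_true)
    """
    log_true = np.asarray(log_true, dtype=float)
    counts = np.asarray(counts, dtype=float)
    detected = counts > 0

    # Amplification bias: least-squares line through the detected spike-ins.
    X = np.column_stack([np.ones(detected.sum()), log_true[detected]])
    alpha, beta = np.linalg.lstsq(X, np.log(counts[detected]), rcond=None)[0]

    # Dropout: logistic regression of detection on log_true, fitted with
    # Newton's method. A small ridge penalty keeps the Hessian invertible.
    Z = np.column_stack([np.ones_like(log_true), log_true])
    y = detected.astype(float)
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Z @ w, -30, 30)))
        grad = Z.T @ (y - p) - ridge * w
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(2)
        w = w + np.linalg.solve(hess, grad)
    kappa, tau = w
    return float(alpha), float(beta), float(kappa), float(tau)

# Simulate one cell with 92 hypothetical spike-in species of known abundance.
rng = np.random.default_rng(0)
log_true = np.linspace(0.0, 10.0, 92)          # log input molecules
true_alpha, true_beta = 1.0, 0.9               # assumed amplification bias
true_kappa, true_tau = -2.0, 1.0               # assumed detection curve

p_detect = 1.0 / (1.0 + np.exp(-(true_kappa + true_tau * log_true)))
is_detected = rng.random(log_true.size) < p_detect
signal = np.exp(true_alpha + true_beta * log_true
                + rng.normal(0.0, 0.3, log_true.size))
# Values are left continuous for simplicity; dropped spike-ins read as zero.
counts = np.where(is_detected, signal, 0.0)

est_alpha, est_beta, est_kappa, est_tau = fit_cell_technical_params(log_true, counts)
print(f"alpha={est_alpha:.2f} beta={est_beta:.2f} "
      f"kappa={est_kappa:.2f} tau={est_tau:.2f}")
```

In a framework of the kind described above, per-cell estimates like these would then enter the gene-level model as known technical parameters, so that the remaining variation can be attributed to biology. The two-stage fit shown here is a simplification; an empirical Bayes treatment would also carry the uncertainty of these estimates forward.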
