Vireo: Bayesian demultiplexing of pooled single-cell RNA-seq data without genotype reference

The joint analysis of multiple samples using single-cell RNA-seq is a promising experimental design, offering increased throughput while allowing batch variation to be accounted for. To enable multi-sample designs, genetic variants that segregate between the samples in the pool have been proposed as natural barcodes for cell demultiplexing. Existing demultiplexing strategies rely on access to complete genotype data from the pooled samples, which greatly limits the applicability of such methods, in particular when genetic variation is not the primary object of study. To address this, we present Vireo, a computationally efficient Bayesian model to demultiplex single-cell data from pooled experimental designs. Uniquely, our model can be applied in settings where only partial or no genotype information is available. Using simulations based on synthetic mixtures and results on real data, we demonstrate the robustness of our model and illustrate the utility of multi-sample experimental designs for common expression analyses.
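
To illustrate how segregating variants can act as natural barcodes, the following is a minimal sketch of the simpler, fully genotyped setting that existing strategies assume. It is not Vireo's model, which additionally handles partial or missing genotypes. The function name `donor_posteriors`, the allele-fraction values in `ALT_FRACTION`, and the toy counts are all assumptions made for illustration. Each cell is scored against each donor with a binomial likelihood of its alternative-allele read counts, and the scores are normalized under a uniform prior over donors.

```python
import math

# Assumed expected ALT-allele read fraction for genotypes with 0, 1, or 2
# ALT alleles (illustrative values that leave room for a small error rate).
ALT_FRACTION = {0: 0.01, 1: 0.5, 2: 0.99}


def donor_posteriors(alt_counts, depths, genotypes):
    """Posterior probability of each donor for one cell, uniform prior.

    alt_counts[j], depths[j]: ALT reads and total reads at variant j in the cell.
    genotypes[k][j]: number of ALT alleles (0, 1, or 2) of donor k at variant j.
    """
    log_liks = []
    for donor_gt in genotypes:
        ll = 0.0
        for a, d, g in zip(alt_counts, depths, donor_gt):
            theta = ALT_FRACTION[g]
            # Binomial log-likelihood; the binomial coefficient is the same
            # for every donor, so it is dropped.
            ll += a * math.log(theta) + (d - a) * math.log(1.0 - theta)
        log_liks.append(ll)
    # Normalize in a numerically stable way (log-sum-exp trick).
    m = max(log_liks)
    weights = [math.exp(ll - m) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: two donors, three variants, one cell.
genotypes = [[0, 2, 1], [2, 0, 1]]  # donor 0 and donor 1
alt_counts = [0, 3, 1]              # ALT reads observed in the cell
depths = [2, 3, 2]                  # total reads observed in the cell
post = donor_posteriors(alt_counts, depths, genotypes)
```

In this toy case the cell's reads match donor 0 at the two variants where the donors differ, so nearly all posterior mass falls on donor 0. The third variant, at which both donors are heterozygous, contributes the same likelihood to each donor and therefore carries no information for the assignment.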
