SCC: an accurate imputation method for scRNA-seq dropouts based on a mixture model

Background Single-cell RNA sequencing (scRNA-seq) enables the possibility of many in-depth transcriptomic analyses at a single-cell resolution. It’s already widely used for exploring the dynamic development process of life, studying the gene regulation mechanism, and discovering new cell types. However, the low RNA capture rate, which cause highly sparse expression with dropout, makes it difficult to do downstream analyses. Results We propose a new method SCC to impute the dropouts of scRNA-seq data. Experiment results show that SCC gives competitive results compared to two existing methods while showing superiority in reducing the intra-class distance of cells and improving the clustering accuracy in both simulation and real data. Conclusions SCC is an effective tool to resolve the dropout noise in scRNA-seq data. The code is freely accessible at https://github.com/nwpuzhengyan/SCC .

[1]  Wei Vivian Li,et al.  An accurate and robust imputation method scImpute for single-cell RNA-seq data , 2018, Nature Communications.

[2]  C. Greenwood,et al.  A smoothed EM-algorithm for DNA methylation profiles from sequencing-based methods in cell lines or for a single cell type , 2017, Statistical applications in genetics and molecular biology.

[3]  Fabian J. Theis,et al.  Model-based branching point detection in single-cell data by K-branches clustering , 2016, bioRxiv.

[4]  R. Satija,et al.  Single-cell RNA sequencing to explore immune cell heterogeneity , 2017, Nature Reviews Immunology.

[5]  Bo Ding,et al.  Normalization and noise reduction for single cell RNA-seq experiments , 2015, Bioinform..

[6]  Mingyao Li,et al.  Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease , 2018, Science.

[7]  P. Linsley,et al.  MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data , 2015, Genome Biology.

[8]  S. Potter,et al.  Single-cell RNA sequencing for the study of development, physiology and disease , 2018, Nature Reviews Nephrology.

[9]  M. Hemberg,et al.  scmap: projection of single-cell RNA-seq data across data sets , 2018, Nature Methods.

[10]  Bo Wang,et al.  Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning , 2016, Nature Methods.

[11]  James Hicks,et al.  Unravelling biology and shifting paradigms in cancer with single-cell sequencing , 2017, Nature Reviews Cancer.

[12]  Sergey I. Nikolenko,et al.  BayesHammer: Bayesian clustering for error correction in single-cell sequencing , 2012, BMC Genomics.

[13]  Benoit Rivard,et al.  Intra- and inter-class spectral variability of tropical tree species at La Selva, Costa Rica: Implications for species identification using HYDICE imagery , 2006 .

[14]  Joshua W. K. Ho,et al.  CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data , 2016, Genome Biology.

[15]  M. Schaub,et al.  SC3 - consensus clustering of single-cell RNA-Seq data , 2016, Nature Methods.