scCancer: a package for automated processing of single-cell RNA-seq data in cancer

Molecular heterogeneity and complex microenvironments pose great challenges for cancer diagnosis and treatment. Recent advances in single-cell RNA-sequencing (scRNA-seq) technology make it possible to study cancer cell heterogeneity and tumor microenvironments at the single-cell transcriptomic level. Here, we present an R package named scCancer, which focuses on processing and analyzing scRNA-seq data for cancer research. Beyond the basic data processing steps, the package takes several cancer-specific features into account. First, it introduces comprehensive quality control metrics. Second, it uses a data-driven machine learning algorithm to accurately identify the major cell populations of the cancer microenvironment. Third, it estimates a malignancy score to classify cells as malignant (cancerous) or non-malignant. Fourth, it analyzes intra-tumor heterogeneity through key cellular phenotypes (such as cell cycle and stemness), gene signatures and cell-cell interactions. In addition, it provides multi-sample data integration analysis with different batch-effect correction strategies. Finally, it generates user-friendly graphic reports for all the analyses. By testing the package on 56 samples with 433,405 cells in total, we demonstrate its good performance. The package is available at: http://lifeome.net/software/sccancer/.
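As an illustration of the quality control step, the following is a minimal Python sketch of three per-cell metrics that are commonly used for scRNA-seq data: total UMI count, number of detected genes, and the fraction of counts from mitochondrial genes. It is not the scCancer implementation (the package itself is written in R), and the abstract does not specify which metrics or thresholds the package uses. The function names `qc_metrics` and `keep_cell`, the `MT-` gene prefix, the toy counts and the thresholds are all assumptions made for this example.

```python
from typing import Dict, List


def qc_metrics(counts: Dict[str, List[int]], mito_prefix: str = "MT-") -> List[dict]:
    """Compute per-cell QC metrics from a gene -> per-cell UMI counts mapping.

    Hypothetical helper for illustration; not part of scCancer.
    """
    n_cells = len(next(iter(counts.values())))
    metrics = []
    for j in range(n_cells):
        # Total UMI count of cell j across all genes.
        total = sum(row[j] for row in counts.values())
        # Number of genes with at least one count in cell j.
        detected = sum(1 for row in counts.values() if row[j] > 0)
        # Counts assigned to mitochondrial genes (assumed "MT-" prefix).
        mito = sum(row[j] for gene, row in counts.items()
                   if gene.startswith(mito_prefix))
        metrics.append({
            "n_umi": total,
            "n_genes": detected,
            "mito_frac": mito / total if total else 0.0,
        })
    return metrics


def keep_cell(m: dict, min_umi: int, min_genes: int, max_mito_frac: float) -> bool:
    """Return True if a cell passes all three (assumed) QC thresholds."""
    return (m["n_umi"] >= min_umi
            and m["n_genes"] >= min_genes
            and m["mito_frac"] <= max_mito_frac)


# Toy data: three genes measured in three cells (values are made up).
toy_counts = {
    "MT-CO1": [1, 8, 0],
    "EPCAM":  [5, 1, 0],
    "PTPRC":  [4, 1, 1],
}

toy_metrics = qc_metrics(toy_counts)
# Illustrative thresholds only; real cutoffs depend on the data set.
toy_kept = [keep_cell(m, min_umi=5, min_genes=2, max_mito_frac=0.5)
            for m in toy_metrics]
```

In this toy example the second cell fails the mitochondrial fraction cutoff and the third fails the UMI and gene count cutoffs, so only the first cell is retained.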
