Identification of cancer subtypes from single-cell RNA-seq data using a consensus clustering method

Background: Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets that contain substantially fewer cells than genes.

Methods: In this paper, we introduce conCluster, a consensus clustering framework for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.

Results: Applied to real cancer scRNA-seq datasets, conCluster detects cancer subtypes more accurately than widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions: Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.
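
To make the ensemble idea in the Methods concrete, the following is a minimal sketch of one common way to fuse several basic partitions into consensus clusters: build a co-association matrix (the fraction of partitions in which two cells share a cluster) and link cells that co-cluster in a majority of partitions. The abstract does not specify how conCluster performs its fusion step, so this should not be read as the authors' algorithm. The function names, the 0.5 threshold, and the toy labels are all assumptions chosen for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions in which each pair of cells shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / m for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link cells co-clustered in more than `threshold` of the partitions and
    return the connected components as consensus labels (hypothetical helper)."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))  # union-find forest over cells

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy example (made-up labels): three base partitions of six cells.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # cell 2 placed with the second group
    [0, 0, 0, 1, 1, 2],  # cell 5 split into its own cluster
]
labels = consensus_clusters(partitions)
print(labels)
```

In this toy input each base partition disagrees with the others on at most one cell, and the majority vote over pairs recovers two groups of three cells. A practical pipeline would typically obtain the base partitions by running different clustering methods or parameter settings on the expression matrix, and might replace the threshold step with hierarchical clustering on the co-association matrix.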
