MLG: multilayer graph clustering for multi-condition scRNA-seq data

Single-cell transcriptome sequencing (scRNA-seq) has enabled investigations of cellular heterogeneity at exceedingly high resolution. Identifying novel cell types or transient developmental stages across multiple experimental conditions is one of its key applications. Linear and non-linear dimensionality reduction for data integration has become a foundational tool for inference from scRNA-seq data. We present Multi Layer Graph Clustering (MLG), an integrative approach for combining multiple dimensionality reductions of multi-condition scRNA-seq data. MLG generates a multilayer shared nearest neighbor cell graph with a higher signal-to-noise ratio and outperforms current best practices in clustering accuracy across large-scale benchmarking experiments. Application of MLG to a wide variety of datasets from multiple conditions highlights how MLG boosts the signal-to-noise ratio for fine-grained sub-population identification. MLG is widely applicable to settings where single-cell data are integrated via dimension reduction.
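
As a rough illustration of the idea of a multilayer shared nearest neighbor (SNN) graph, the sketch below builds one SNN layer per low-dimensional embedding and then merges the layers. It is not the authors' implementation. The function names (knn_sets, snn_edges, multilayer_union), the Jaccard edge weight, the value of k, the toy embeddings, and the choice to merge layers by taking the union of their edges with the maximum weight are all assumptions made for this example; the abstract does not specify them. The merged graph would then be handed to a graph clustering algorithm, which is omitted here.

```python
from itertools import combinations
from math import dist


def knn_sets(embedding, k):
    """For each cell, return the set of indices of its k nearest cells (Euclidean)."""
    neighbours = []
    for i, point in enumerate(embedding):
        others = sorted(
            (j for j in range(len(embedding)) if j != i),
            key=lambda j: (dist(point, embedding[j]), j),  # ties broken by index
        )
        neighbours.append(set(others[:k]))
    return neighbours


def snn_edges(embedding, k):
    """One SNN layer: edge weight is the Jaccard overlap of two cells'
    neighbourhoods (each cell is counted in its own neighbourhood)."""
    nn = [s | {i} for i, s in enumerate(knn_sets(embedding, k))]
    edges = {}
    for i, j in combinations(range(len(embedding)), 2):
        weight = len(nn[i] & nn[j]) / len(nn[i] | nn[j])
        if weight > 0:
            edges[(i, j)] = weight
    return edges


def multilayer_union(layers):
    """Merge layers by edge union, keeping the largest weight per edge.
    This merge rule is an assumption of the sketch."""
    combined = {}
    for layer in layers:
        for edge, weight in layer.items():
            combined[edge] = max(weight, combined.get(edge, 0.0))
    return combined


# Two hypothetical 1-D embeddings of the same six cells (e.g. from two
# different dimensionality reduction methods). Cells 0-2 and 3-5 form two groups.
embedding_a = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
embedding_b = [[0.0], [0.1], [4.0], [5.0], [5.1], [5.2]]  # cell 2 is misplaced here

layer_a = snn_edges(embedding_a, k=2)
layer_b = snn_edges(embedding_b, k=2)
combined = multilayer_union([layer_a, layer_b])

for edge in sorted(combined):
    print(edge, round(combined[edge], 2))
```

In this toy input, the second embedding places cell 2 near the wrong group, so its layer connects cell 2 only weakly to cells 0 and 1. The first embedding supplies full-weight edges for those pairs, and the merged graph retains them.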
