Computational methods for the integrative analysis of single-cell data

Recent advances in single-cell technologies provide new opportunities for dissecting tissue heterogeneity and investigating cell identity, fate and function. This young, rapidly expanding field is supplying biologists with many new types of data, each with its own complexity and information content. The integrative analysis of genomic data, collected at different molecular layers from diverse cell populations, promises to capture the full-scale complexity of biological systems. However, combining different single-cell genomic signals is computationally challenging, because these data are intrinsically heterogeneous for experimental, technical and biological reasons. Here, we describe computational methods for the integrative analysis of single-cell genomic data, with a focus on the integration of single-cell RNA sequencing datasets and on the joint analysis of multimodal signals from individual cells.
