Decoding single-cell multiomics: scMaui - A deep learning framework for uncovering cellular heterogeneity in the presence of batch effects and missing data

Recent advances in high-throughput single-cell sequencing have created a pressing need for computational models that can handle the high complexity of single-cell multiomics data. Careful integration of single-cell multiomics data is required to avoid bias towards a specific modality and to overcome data sparsity. Batch effects that obscure biological signals must also be taken into account. Here, we introduce a new single-cell multiomics integration model, Single-cell Multiomics Autoencoder Integration (scMaui), based on stacked variational encoders and adversarial learning. scMaui reduces the integrated data modalities to a low-dimensional latent space that captures cellular heterogeneity. It can handle multiple batch effects independently, accepting both discrete and continuous values, and it provides a range of reconstruction loss functions to cover various assays and preprocessing pipelines. We show that scMaui outperforms other methods in many tasks. Further downstream analyses also demonstrate its potential for identifying relations between assays and discovering hidden subpopulations.
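To make the ingredients named above concrete, the following is a minimal NumPy sketch of one forward pass through a two-modality variational encoder with a masked reconstruction loss and an adversarial batch term. It is an illustration under stated assumptions, not the scMaui implementation: the layer sizes, the concatenation of modality-specific hidden layers, the squared-error loss, the linear batch classifier, and every name in the code (`masked_mse`, `batch_cross_entropy`, `adv_weight`, and so on) are hypothetical choices made for this example.

```python
import numpy as np

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, logvar):
    """Mean over cells of KL(q(z | x) || N(0, I))."""
    per_cell = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    return float(np.mean(per_cell))

def masked_mse(x, x_hat, observed):
    """Squared error averaged over observed entries only (missing ones are ignored)."""
    return float(np.sum(observed * (x - x_hat) ** 2) / max(observed.sum(), 1))

def batch_cross_entropy(z, w_adv, batch):
    """Cross-entropy of a linear softmax adversary predicting batch labels from z."""
    logits = z @ w_adv
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(batch)), batch]))

# Toy data (all sizes are assumptions): 8 cells, two modalities, 2 batches.
rng = np.random.default_rng(0)
n_cells, n_genes, n_peaks, n_hidden, n_latent, n_batches = 8, 5, 4, 6, 3, 2
x_rna = rng.standard_normal((n_cells, n_genes))
x_atac = rng.standard_normal((n_cells, n_peaks))
obs_rna = np.ones_like(x_rna)
obs_atac = np.ones_like(x_atac)
obs_atac[:3] = 0.0  # the first three cells lack the second modality
batch = np.array([0, 1, 0, 1, 0, 1, 0, 1])

def init(*shape):
    """Small random weights; a real model would learn these by gradient descent."""
    return 0.1 * rng.standard_normal(shape)

w_rna, w_atac = init(n_genes, n_hidden), init(n_peaks, n_hidden)
w_mu, w_logvar = init(2 * n_hidden, n_latent), init(2 * n_hidden, n_latent)
w_dec_rna, w_dec_atac = init(n_latent, n_genes), init(n_latent, n_peaks)
w_adv = init(n_latent, n_batches)

# Modality-specific hidden layers (missing inputs zeroed), concatenated and
# mapped to a shared latent distribution. This combination rule is an assumption.
h = np.concatenate(
    [np.tanh((x_rna * obs_rna) @ w_rna), np.tanh((x_atac * obs_atac) @ w_atac)],
    axis=1,
)
mu, logvar = h @ w_mu, h @ w_logvar
z = reparameterize(mu, logvar, rng)

recon = masked_mse(x_rna, z @ w_dec_rna, obs_rna) + masked_mse(x_atac, z @ w_dec_atac, obs_atac)
kl = kl_to_standard_normal(mu, logvar)
adv = batch_cross_entropy(z, w_adv, batch)

# The autoencoder is rewarded when the adversary fails to recover the batch,
# so the adversary's loss enters with a negative sign.
adv_weight = 1.0
total_loss = recon + kl - adv_weight * adv
print(f"recon={recon:.3f} kl={kl:.3f} adv={adv:.3f} total={total_loss:.3f}")
```

In a trained model the adversary would be updated to minimise its own cross-entropy while the encoder and decoders minimise `total_loss`; only the loss computation is shown here. Swapping `masked_mse` for a different per-modality loss is one way such a design could cover different assays and preprocessing pipelines.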
