A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis

Background Single-cell RNA sequencing (scRNA-seq) is an emerging technology that can assess the function of an individual cell and cell-to-cell variability at the single cell level in an unbiased manner. Dimensionality reduction is an essential first step in downstream analysis of the scRNA-seq data. However, the scRNA-seq data are challenging for traditional methods due to their high dimensional measurements as well as an abundance of dropout events (that is, zero expression measurements). Results To overcome these difficulties, we propose DR-A (Dimensionality Reduction with Adversarial variational autoencoder), a data-driven approach to fulfill the task of dimensionality reduction. DR-A leverages a novel adversarial variational autoencoder-based framework, a variant of generative adversarial networks. DR-A is well-suited for unsupervised learning tasks for the scRNA-seq data, where labels for cell types are costly and often impossible to acquire. Compared with existing methods, DR-A is able to provide a more accurate low dimensional representation of the scRNA-seq data. We illustrate this by utilizing DR-A for clustering of scRNA-seq data. Conclusions Our results indicate that DR-A significantly enhances clustering performance over state-of-the-art methods.

[1]  Joydeep Ghosh,et al.  Cluster Ensembles --- A Knowledge Reuse Framework for Combining Multiple Partitions , 2002, J. Mach. Learn. Res..

[2]  Sreeram Kannan,et al.  ClusterGAN : Latent Space Clustering in Generative Adversarial Networks , 2018, AAAI.

[3]  Hilde van der Togt,et al.  Publisher's Note , 2003, J. Netw. Comput. Appl..

[4]  Andrey Kazennov,et al.  The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology , 2016, Oncotarget.

[5]  E. Pierson,et al.  ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis , 2015, Genome Biology.

[6]  A. Oudenaarden,et al.  Validation of noise models for single-cell transcriptomics , 2014, Nature Methods.

[7]  Morteza Mardani,et al.  Deep Generative Adversarial Neural Networks for Compressive Sensing MRI , 2019, IEEE Transactions on Medical Imaging.

[8]  Evan Z. Macosko,et al.  Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets , 2015, Cell.

[9]  Léon Bottou,et al.  Wasserstein GAN , 2017, ArXiv.

[10]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  S. Dudoit,et al.  A general and flexible method for signal extraction from single-cell RNA-seq data , 2018, Nature Communications.

[12]  Samuel L. Wolock,et al.  A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure. , 2016, Cell systems.

[13]  Leland McInnes,et al.  UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction , 2018, ArXiv.

[14]  Evan Z. Macosko,et al.  A Molecular Census of Arcuate Hypothalamus and Median Eminence Cell Types , 2017, Nature Neuroscience.

[15]  Grace X. Y. Zheng,et al.  Massively parallel digital transcriptional profiling of single cells , 2016, Nature Communications.

[16]  Yue Zhang,et al.  Scalable preprocessing for sparse scRNA-seq data exploiting prior knowledge , 2018, Bioinform..

[17]  Michael I. Jordan,et al.  Deep Generative Modeling for Single-cell Transcriptomics , 2018, Nature Methods.

[18]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[19]  Bo Hu,et al.  Unsupervised Learning for Cell-Level Visual Representation in Histopathology Images With Generative Adversarial Networks , 2017, IEEE Journal of Biomedical and Health Informatics.

[20]  Huiqi Li,et al.  Synthesizing retinal and neuronal images with generative adversarial nets , 2018, Medical Image Anal..

[21]  Paul Kline,et al.  An easy guide to factor analysis , 1993 .

[22]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[23]  Navdeep Jaitly,et al.  Adversarial Autoencoders , 2015, ArXiv.

[24]  S. Linnarsson,et al.  Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq , 2015, Science.

[25]  Chulhee Lee,et al.  Feature extraction based on the Bhattacharyya distance , 2003, Pattern Recognit..

[26]  Sergey Nikolenko,et al.  druGAN: An Advanced Generative Adversarial Autoencoder Model for de Novo Generation of New Molecules with Desired Molecular Properties in Silico. , 2017, Molecular pharmaceutics.

[27]  M. Hemberg,et al.  Identifying cell populations with scRNASeq. , 2017, Molecular aspects of medicine.

[28]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[29]  Kevin R. Moon,et al.  Exploring single-cell data with deep multitasking neural networks , 2017, Nature Methods.

[30]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[31]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[32]  Richard A. Muscat,et al.  Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding , 2018, Science.

[33]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[34]  Lai Guan Ng,et al.  Dimensionality reduction for visualizing single-cell data using UMAP , 2018, Nature Biotechnology.