Fast and precise single-cell data analysis using a hierarchical autoencoder

A primary challenge in single-cell RNA sequencing (scRNA-seq) studies is the massive volume of data combined with its high noise level. To address this challenge, we introduce a hierarchical autoencoder that reliably extracts representative information for each cell. In an extensive analysis, we demonstrate that the approach vastly outperforms state-of-the-art techniques in many areas of scRNA-seq analysis, including cell segregation through unsupervised learning, visualization of the transcriptome landscape, cell classification, and pseudo-time inference.
