Learning interpretable cellular responses to complex perturbations in high-throughput screens

Recent advances in multiplexed single-cell transcriptomics experiments are facilitating the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally infeasible, so computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA encodes and learns transcriptional drug responses across different combinations of cell type, dose, and drug. The model produces easy-to-interpret embeddings for drugs and cell types, which enable drug similarity analysis and predictions for unseen dosages and drug combinations. We show that CPA accurately models single-cell perturbations across compounds, doses, species, and time. We further demonstrate that CPA predicts combinatorial genetic interactions of several types, implying that it captures features that distinguish different interaction programs. Finally, we demonstrate that CPA can generate, in silico, 5,329 missing genetic combination perturbations (97.6% of all possibilities) with diverse genetic interactions. We envision that our model will facilitate efficient experimental design and hypothesis generation by enabling in-silico response prediction at the single-cell level, and thus accelerate therapeutic applications using single-cell technologies.
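The abstract does not spell out how the embeddings are combined, so the following is only a minimal sketch of one way "linear-model interpretability" in a latent space could look. It assumes that a perturbed latent state is a basal state plus a cell-type embedding plus dose-scaled drug embeddings. The names `compose_latent`, `dose_scale`, `drug_emb`, and `cell_emb`, the log dose scaling, and the random embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical setup: random vectors stand in for learned embeddings.
rng = np.random.default_rng(0)
latent_dim = 4
drug_emb = {name: rng.normal(size=latent_dim) for name in ("drug_A", "drug_B")}
cell_emb = {name: rng.normal(size=latent_dim) for name in ("cell_X", "cell_Y")}


def dose_scale(dose: float) -> float:
    """Assumed monotone dose scaling; a trained model might learn this instead."""
    return float(np.log1p(dose))


def compose_latent(z_basal: np.ndarray, cell_type: str, drug_doses: dict) -> np.ndarray:
    """Add the cell-type embedding and dose-scaled drug embeddings to a basal state."""
    z = z_basal + cell_emb[cell_type]
    for drug, dose in drug_doses.items():
        z = z + dose_scale(dose) * drug_emb[drug]
    return z


def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Compare two embeddings, e.g. for a drug similarity analysis."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Compose a latent state for a drug combination never composed before.
z_basal = np.zeros(latent_dim)
z_combo = compose_latent(z_basal, "cell_X", {"drug_A": 1.0, "drug_B": 2.0})
similarity = cosine_similarity(drug_emb["drug_A"], drug_emb["drug_B"])
```

Under this additive assumption, each drug's contribution can be read off separately, and a combination is simply the sum of its parts in latent space; a nonlinear decoder (not shown) would then map the composed state back to gene expression.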
