cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data

We present cisTopic, a probabilistic framework used to simultaneously discover coaccessible enhancers and stable cell states from sparse single-cell epigenomics data (http://github.com/aertslab/cistopic). Using a compendium of single-cell ATAC-seq datasets from differentiating hematopoietic cells, brain and transcription factor perturbations, we demonstrate that topic modeling can be exploited for robust identification of cell types, enhancers and relevant transcription factors. cisTopic provides insight into the mechanisms underlying regulatory heterogeneity in cell populations. As an unsupervised Bayesian framework, cisTopic classifies regions in scATAC-seq data into regulatory topics, which are used for clustering.
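The idea of grouping regions into topics and then clustering cells by those topics can be illustrated with a small sketch. It is not the cisTopic implementation: it assumes the topic model is Latent Dirichlet Allocation fitted with a generic collapsed Gibbs sampler, treats each cell as a "document" whose "words" are its accessible regions, and runs on simulated data. The function name gibbs_lda, the hyperparameters alpha and beta, the two simulated cell states, and the argmax clustering step are all hypothetical choices made for illustration.

```python
import random

def gibbs_lda(cells, n_regions, n_topics, alpha=0.1, beta=0.1, n_iter=100, seed=0):
    """Generic collapsed Gibbs sampler for LDA (illustrative, not cisTopic's code).

    cells: list of lists; each inner list holds the accessible region indices of one cell.
    Returns (theta, phi): cell-by-topic and topic-by-region probability tables.
    """
    rng = random.Random(seed)
    cell_topic = [[0] * n_topics for _ in cells]                # counts per cell and topic
    topic_region = [[0] * n_regions for _ in range(n_topics)]   # counts per topic and region
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every (cell, accessible region) pair.
    for c, regions in enumerate(cells):
        z_c = []
        for r in regions:
            k = rng.randrange(n_topics)
            z_c.append(k)
            cell_topic[c][k] += 1
            topic_region[k][r] += 1
            topic_total[k] += 1
        assign.append(z_c)

    for _ in range(n_iter):
        for c, regions in enumerate(cells):
            for i, r in enumerate(regions):
                # Remove the current assignment from the counts.
                k = assign[c][i]
                cell_topic[c][k] -= 1
                topic_region[k][r] -= 1
                topic_total[k] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (cell_topic[c][t] + alpha)
                    * (topic_region[t][r] + beta)
                    / (topic_total[t] + n_regions * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[c][i] = k
                cell_topic[c][k] += 1
                topic_region[k][r] += 1
                topic_total[k] += 1

    # Smoothed estimates of the topic-per-cell and region-per-topic distributions.
    theta = [[(n + alpha) / (sum(row) + n_topics * alpha) for n in row]
             for row in cell_topic]
    phi = [[(n + beta) / (topic_total[k] + n_regions * beta) for n in topic_region[k]]
           for k in range(n_topics)]
    return theta, phi

# Simulated sparse data: 20 cells, 40 regions, two hypothetical cell states.
data_rng = random.Random(1)
n_regions = 40
cells = []
for c in range(20):
    own = range(0, 20) if c % 2 == 0 else range(20, 40)   # regions typical of this state
    cells.append([r for r in range(n_regions)
                  if data_rng.random() < (0.5 if r in own else 0.05)])

theta, phi = gibbs_lda(cells, n_regions, n_topics=2)

# Simplest possible clustering: assign each cell to its highest-probability topic.
clusters = [max(range(2), key=lambda k: row[k]) for row in theta]
```

Here theta plays the role of the topic contributions per cell and phi the region contributions per topic. Any downstream clustering or dimensionality reduction could be applied to theta in place of the argmax step used above.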
