SnapATAC: A Comprehensive Analysis Package for Single Cell ATAC-seq

Identification of the cis-regulatory elements controlling cell-type-specific gene expression patterns is essential for understanding the origin of cellular diversity. Conventional assays that map regulatory elements via open chromatin analysis of primary tissues are hindered by the heterogeneity of the samples. Single-cell analysis of transposase-accessible chromatin (scATAC-seq) can overcome this limitation. However, the high noise level of each single-cell profile and the large volume of data pose unique computational challenges. Here, we introduce SnapATAC, a software package for analyzing scATAC-seq datasets. SnapATAC can efficiently dissect cellular heterogeneity in an unbiased manner and map the trajectories of cellular states. Using the Nyström method, a sampling technique that generates a low-rank embedding for large-scale datasets, SnapATAC can process data from up to a million cells. Furthermore, SnapATAC incorporates existing tools into a comprehensive package for analyzing single-cell ATAC-seq datasets. As a demonstration of its utility, SnapATAC was applied to 55,592 single-nucleus ATAC-seq profiles from the mouse secondary motor cortex. The analysis revealed ∼370,000 candidate regulatory elements in 31 distinct cell populations in this brain region and inferred candidate transcriptional regulators in each of the cell types.
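To make the Nyström idea concrete, the following is a minimal Python sketch of a landmark-based low-rank embedding. It is not SnapATAC's implementation: the RBF kernel is a stand-in chosen for illustration, and the names `rbf_kernel`, `nystrom_embedding`, `n_landmarks`, `n_components`, and `gamma` are hypothetical. The sketch only shows the general principle that the eigendecomposition is computed on a small sampled subset of cells and then extended to all cells.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF similarities between the rows of a and the rows of b."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def nystrom_embedding(x, n_landmarks, n_components, gamma=1.0, seed=0):
    """Approximate a kernel embedding of all rows of x from a landmark subset.

    Returns the (n_rows, n_components) embedding and the landmark indices.
    """
    rng = np.random.default_rng(seed)
    # Step 1: sample landmark rows uniformly without replacement.
    idx = rng.choice(x.shape[0], size=n_landmarks, replace=False)
    landmarks = x[idx]

    # Step 2: eigendecompose the small landmark-by-landmark kernel matrix.
    k_mm = rbf_kernel(landmarks, landmarks, gamma)
    evals, evecs = np.linalg.eigh(k_mm)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.maximum(evals[order], 1e-12)  # guard against division by ~0
    evecs = evecs[:, order]

    # Step 3: extend to every row using only its similarities to the landmarks.
    k_nm = rbf_kernel(x, landmarks, gamma)
    return k_nm @ evecs / np.sqrt(evals), idx


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(1000, 20))
    embedding, landmark_idx = nystrom_embedding(data, n_landmarks=100, n_components=10, gamma=0.05)
    print(embedding.shape)
```

In this sketch the expensive step costs a function of the number of landmarks rather than the number of cells, which is the property that lets a sampling-based approach scale to very large datasets.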
