Destin: toolkit for single-cell analysis of chromatin accessibility

Abstract

Summary: Single-cell assay of transposase-accessible chromatin followed by sequencing (scATAC-seq) is an emerging technology for studying gene regulation at single-cell resolution. Data from scATAC-seq are distinctive: sparse, binary and highly variable even within the same cell type. As a result, methods developed for neither bulk ATAC-seq nor single-cell RNA-seq data are appropriate. Here, we present Destin, a bioinformatic and statistical framework for comprehensive scATAC-seq data analysis. Destin performs cell-type clustering via weighted principal component analysis, weighting accessible chromatin regions by existing genomic annotations and publicly available regulomic datasets. The weights and additional tuning parameters are determined via model-based likelihood. We evaluated the performance of Destin using downsampled bulk ATAC-seq data of purified samples and scATAC-seq data from seven diverse experiments. Destin outperformed existing methods across all datasets and platforms. As a demonstration, we further applied Destin to 2088 adult mouse forebrain cells and identified cell-type-specific associations of previously reported schizophrenia GWAS loci.

Availability and implementation: The Destin toolkit is freely available as an R package at https://github.com/urrutiag/destin.

Supplementary information: Supplementary data are available at Bioinformatics online.
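The clustering idea described in the summary, weighted principal component analysis followed by grouping of cells, can be illustrated with a minimal Python sketch. This is not Destin's implementation (Destin is an R package, and its weighting scheme and likelihood-based tuning are not reproduced here). The function names `weighted_pca` and `kmeans`, the per-region `weights` vector, and the choice of k-means as the clustering step are all assumptions made for illustration.

```python
import numpy as np


def weighted_pca(X, weights, n_components=2):
    """Return PC scores for a cells-by-regions matrix with per-region weights.

    X       : binary accessibility matrix, shape (n_cells, n_regions)
    weights : hypothetical per-region weights, shape (n_regions,)
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    Xw = X * w                      # scale each region (column) by its weight
    Xc = Xw - Xw.mean(axis=0)       # center each weighted region
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; an illustrative stand-in for the clustering step."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # squared Euclidean distance from every point to every center
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # recompute centers; keep the old center if a cluster is empty
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

Calling `kmeans(weighted_pca(X, weights), k)` on a cells-by-regions matrix returns one cluster label per cell. In this sketch, raising the weight on a region increases its contribution to the principal components, which is the general mechanism by which annotation-based weights can influence the clustering.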
