FITs: forest of imputation trees for recovering true signals in single-cell open chromatin profiles

The advent of single-cell open-chromatin profiling technology has facilitated the analysis of heterogeneity in the activity of regulatory regions at single-cell resolution. However, stochasticity and the low amount of relevant DNA available cause high dropout rates and noise in single-cell open-chromatin profiles. Here we introduce a robust method, Forest of Imputation Trees (FITs), to recover original signals from highly sparse and noisy single-cell open-chromatin profiles. FITs builds a forest of imputation trees to avoid bias during the restoration of read-count matrices. It resolves the challenging issue of recovering open-chromatin signals without blurring information at genomic sites with cell-type-specific activity. FITs is generalized for wider applicability, especially for highly sparse read-count matrices. Its superiority in recovering signals of minority cells also makes it highly useful for single-cell open-chromatin profiles from in vivo samples. This work was first made available online as a thesis at https://repository.iiitd.edu.in/xmlui/handle/123456789/807
