RNA-Bloom provides lightweight reference-free transcriptome assembly for single cells

We present RNA-Bloom, a de novo RNA-seq assembly algorithm that leverages the rich information content in single-cell transcriptome sequencing (scRNA-seq) data to reconstruct cell-specific isoforms. We benchmark RNA-Bloom’s performance against leading bulk RNA-seq assembly approaches and illustrate its utility in detecting cell-specific gene fusion events using sequencing data from the HiSeq-4000 and BGISEQ-500 platforms. We expect RNA-Bloom to boost the utility of scRNA-seq data by expanding what is currently accessible to informatic analysis.
