A Bayesian model for single cell transcript expression analysis on MERFISH data

Motivation: Multiplexed error‐robust fluorescence in‐situ hybridization (MERFISH) is a recent technology to obtain spatially resolved gene or transcript expression profiles in single cells for hundreds to thousands of genes in parallel. So far, no statistical framework to analyze MERFISH data is available.

Results: We present a Bayesian model for single cell transcript expression analysis on MERFISH data. We show that the model successfully captures uncertainty in MERFISH data and eliminates systematic biases that can occur in raw RNA molecule counts obtained with MERFISH. Our model accurately estimates transcript expression and additionally provides the full probability distribution and credible intervals for each transcript. We further show how this enables MERFISH to scale towards the whole genome while being able to control the uncertainty in obtained results.

Availability and implementation: The presented model is implemented on top of Rust‐Bio (Köster, 2016) and available open‐source as MERFISHtools (https://merfishtools.github.io). It can be easily installed via Bioconda (Grüning et al., 2018). The entire analysis performed in this paper is provided as a fully reproducible Snakemake (Köster and Rahmann, 2012) workflow via Zenodo (https://doi.org/10.5281/zenodo.752340).

Supplementary information: Supplementary data are available at Bioinformatics online.
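To make the notion of a "full probability distribution and credible intervals for each transcript" concrete, the following is a minimal Python sketch under simplifying assumptions. It is not the model from the paper and does not use the MERFISHtools API. As a toy stand-in, it assumes each true molecule is detected independently with a fixed probability and places a flat prior on the true count; the function names `binomial_posterior`, `map_estimate` and `credible_interval` and the parameters `detection_rate` and `max_count` are hypothetical.

```python
from math import comb


def binomial_posterior(observed, detection_rate, max_count):
    """Posterior PMF over the true count x in 0..max_count.

    Toy assumption (not the paper's model): each of the x true molecules
    is detected independently with probability `detection_rate`, and the
    prior over x is flat on 0..max_count.
    """
    weights = []
    for x in range(max_count + 1):
        if x < observed:
            # Cannot observe more molecules than are present in this toy model.
            weights.append(0.0)
        else:
            likelihood = (
                comb(x, observed)
                * detection_rate ** observed
                * (1.0 - detection_rate) ** (x - observed)
            )
            weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def map_estimate(pmf):
    """Count with the highest posterior probability (first one on ties)."""
    return max(range(len(pmf)), key=lambda x: pmf[x])


def credible_interval(pmf, mass=0.95):
    """Equal-tailed credible interval read off the cumulative distribution."""
    tail = (1.0 - mass) / 2.0
    lower = upper = None
    cumulative = 0.0
    for x, p in enumerate(pmf):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = x
        if upper is None and cumulative >= 1.0 - tail:
            upper = x
            break
    if upper is None:
        # Guard against rounding leaving the cumulative sum just below the target.
        upper = len(pmf) - 1
    return lower, upper


if __name__ == "__main__":
    posterior = binomial_posterior(observed=8, detection_rate=0.8, max_count=50)
    print("MAP estimate:", map_estimate(posterior))
    print("95% credible interval:", credible_interval(posterior))
```

Reporting the interval alongside a point estimate is what lets the uncertainty of each transcript count be assessed rather than taken on trust from the raw count; in this toy setting, a lower detection rate widens the interval.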

[1] Tal Nawy, et al. Single-cell sequencing, 2013, Nature Methods.

[2] Kieran R. Campbell, et al. Order under uncertainty: robust differential expression analysis using probabilistic models for pseudotime inference, 2016.

[3] S. Quake, et al. A survey of human brain transcriptome diversity at the single cell level, 2015, Proceedings of the National Academy of Sciences.

[4] Sean C. Bendall, et al. Multiplexed ion beam imaging of human breast tumors, 2014, Nature Medicine.

[5] F. S. Fay, et al. Visualization of single RNA transcripts in situ, 1998, Science.

[6] U. Landegren, et al. Padlock probes: circularizing oligonucleotides for localized DNA detection, 1994, Science.

[7] D. Curran‐Everett, et al. The fickle P value generates irreproducible results, 2015, Nature Methods.

[8] J. Buhmann, et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry, 2014, Nature Methods.

[9] Hazen P. Babcock, et al. High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization, 2016, Proceedings of the National Academy of Sciences.

[10] Renan Valieris, et al. Bioconda: sustainable and comprehensive software distribution for the life sciences, 2018, Nature Methods.

[11] A. Oudenaarden, et al. Single-molecule mRNA detection and counting in mammalian tissue, 2013, Nature Protocols.

[12] Sven Rahmann, et al. Genome analysis, 2022.

[13] Shawn M. Gillespie, et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma, 2014, Science.

[14] Timur Zhiyentayev, et al. Single-cell in situ RNA profiling by sequential hybridization, 2014, Nature Methods.

[15] X. Zhuang, et al. Spatially resolved, highly multiplexed RNA profiling in single cells, 2015, Science.

[16] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[17] Richard W. Hamming, et al. Error detecting and error correcting codes, 1950.

[18] Rahul Satija, et al. MERFISHing for spatial context, 2015, Trends in Immunology.

[19] Alexander van Oudenaarden, et al. Spatially resolved transcriptomics and beyond, 2014, Nature Reviews Genetics.

[20] Johannes Köster, et al. Rust-Bio: a fast and safe bioinformatics library, 2015, Bioinform.

[21] Junhyong Kim, et al. The promise of single-cell sequencing, 2013, Nature Methods.