Predicting cellular position in the Drosophila embryo from Single-Cell Transcriptomics data

Single-cell RNA-seq (scRNAseq) technologies are rapidly evolving, and a growing number of datasets are now available. Although highly informative, standard scRNAseq experiments lose the spatial organization of the cells in the organism or tissue of origin. Conversely, spatial RNA-seq technologies designed to preserve the localization of the cells have limited throughput and gene coverage. Mapping scRNAseq data onto reference data for genes with known spatial expression can thus increase coverage while providing spatial location. However, methods to perform such a mapping are still in their infancy and have not been benchmarked in an unbiased manner. To bridge this gap, we organized the DREAM Single-Cell Transcriptomics challenge to evaluate methods for reconstructing the spatial arrangement of single cells from scRNAseq data. The challenge focused on the spatial reconstruction of cells from the Drosophila embryo using single-cell transcriptomic data, with in situ hybridization data for a set of selected driver genes from the Berkeley Drosophila Transcription Network Project reference atlas serving as the gold standard. The 34 participating teams used an array of different algorithms for gene selection and location prediction. We devised a novel scoring and cross-validation scheme to evaluate the robustness of the best-performing algorithms. Participants were able to correctly and robustly localize rare subpopulations of cells, accurately mapping both spatially co-localized and scattered groups of cells. The selection of predictor genes was essential for accurately locating the cells in the embryo. Among the most frequently selected genes, we observed relatively high expression entropy, high spatial clustering, and the presence of prominent developmental genes such as gap and pair-rule genes and tissue-defining markers.
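
To make the mapping task concrete, the following is a minimal illustrative sketch, not any participating team's method or the challenge's scoring scheme. It assumes that both the cells and the reference atlas positions are described by binarized (on/off) expression of the same driver genes, assigns each cell to the atlas position whose profile has the highest Pearson correlation with it, and computes a simple binary entropy for one gene across cells. The function names (`pearson`, `predict_bin`, `binary_entropy`) and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def predict_bin(cell, atlas):
    """Return the index of the atlas position most correlated with the cell.

    cell  : binarized expression of the driver genes in one cell
    atlas : list of binarized driver-gene profiles, one per embryo position
    """
    scores = [pearson(cell, position) for position in atlas]
    return max(range(len(atlas)), key=scores.__getitem__)


def binary_entropy(values):
    """Shannon entropy (in bits) of an on/off expression pattern across cells."""
    p = sum(values) / len(values)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Toy data (assumed): 3 atlas positions x 4 driver genes.
atlas = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
cells = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
predicted = [predict_bin(cell, atlas) for cell in cells]
gene_entropy = binary_entropy([1, 0, 1, 0])
```

In this toy setting a gene that is on in half of the cells has the maximum entropy of one bit, which is one way to read the abstract's observation that frequently selected predictor genes had relatively high expression entropy.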
