Machine Learning for Uncovering Biological Insights in Spatial Transcriptomics Data

Development and homeostasis in multicellular systems both require exquisite control over the formation and maintenance of spatial molecular patterns. Advances in spatially resolved, high-throughput molecular imaging methods, such as multiplexed immunofluorescence and spatial transcriptomics (ST), provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to the rapid development of innovative machine learning (ML) tools, primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of the rapidly expanding toolbox of ST analysis methods. To address this, we summarize the major ST analysis goals that ML can help address, along with current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in choosing the right tools for the right biological questions.
