Statistical analysis of spatial expression patterns for spatially resolved transcriptomic studies

Identifying genes that display spatial expression patterns in spatially resolved transcriptomic studies is an important first step toward characterizing the spatial transcriptomic landscape of complex tissues. Here we present a statistical method, SPARK, for identifying spatial expression patterns of genes in data generated from various spatially resolved transcriptomic techniques. SPARK directly models spatial count data through generalized linear spatial models. It relies on recently developed statistical formulas for hypothesis testing, which provide effective control of type I errors and yield high statistical power. With a computationally efficient algorithm based on penalized quasi-likelihood, SPARK also scales to datasets with tens of thousands of genes measured on tens of thousands of samples. Analyzing four published spatially resolved transcriptomic datasets with SPARK, we show that it can be up to ten times more powerful than existing methods and that it reveals biological findings that existing approaches miss.
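To make the phrase "generalized linear spatial model for count data" concrete, the following is a minimal simulation sketch, not SPARK's implementation. It assumes a Poisson model with a log link, a spatially correlated random effect drawn under a Gaussian kernel, and independent residual noise. The function name `simulate_spatial_counts` and all parameter names (`tau_spatial`, `tau_noise`, `length_scale`, `size_factors`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_spatial_counts(coords, tau_spatial=1.0, tau_noise=0.1,
                            length_scale=1.0, intercept=0.0,
                            size_factors=None, seed=0):
    """Draw counts for one gene from an assumed Poisson spatial model.

    log(rate_i) = intercept + b_i + eps_i, where b ~ N(0, tau_spatial * K),
    K is a Gaussian kernel on the coordinates, and eps_i ~ N(0, tau_noise).
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    if size_factors is None:
        size_factors = np.ones(n)  # assumed per-location normalizing factors

    # Pairwise squared Euclidean distances between locations.
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Gaussian kernel (an assumption; other kernels are possible).
    kernel = np.exp(-sq_dist / (2.0 * length_scale ** 2))

    # Small jitter on the diagonal keeps the covariance numerically stable.
    cov = tau_spatial * kernel + 1e-8 * np.eye(n)
    spatial_effect = rng.multivariate_normal(np.zeros(n), cov)
    noise = rng.normal(0.0, np.sqrt(tau_noise), size=n)

    rate = size_factors * np.exp(intercept + spatial_effect + noise)
    counts = rng.poisson(rate)
    return counts, spatial_effect


# Example: a 5 x 5 grid of measurement locations.
grid = np.array([(x, y) for x in range(5) for y in range(5)])
counts, spatial_effect = simulate_spatial_counts(grid, length_scale=1.5)
```

In this sketch, setting `tau_spatial` to zero removes the spatial component, which corresponds to a gene with no spatial expression pattern; testing for a spatial pattern then amounts to asking whether that variance component is nonzero.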
