Visualizing the spatial gene expression organization in the brain through non-linear similarity embeddings.

The Allen Brain Atlases enable the study of spatially resolved, genome-wide gene expression patterns across the mammalian brain. Several exploratory studies have applied linear dimensionality reduction methods, such as Principal Component Analysis (PCA) and classical Multi-Dimensional Scaling (cMDS), to gain insight into the spatial organization of these expression patterns. In this paper, we describe a non-linear embedding technique called Barnes-Hut Stochastic Neighbor Embedding (BH-SNE) that emphasizes the local similarity structure of high-dimensional data points. By applying BH-SNE to the gene expression data from the Allen Brain Atlases, we demonstrate that the 2D non-linear embeddings are consistent between the sagittal and coronal mouse brain atlases, and across six human brains. In addition, we show quantitatively that BH-SNE maps separate neuroanatomical regions better than PCA and cMDS do. Finally, we assess the effect of higher-order principal components on the global structure of the BH-SNE similarity maps. Based on our observations, we conclude that BH-SNE maps, with or without prior PCA-based dimensionality reduction, provide comprehensive and intuitive insights into both the local and global spatial transcriptome structure of the human and mouse Allen Brain Atlases.
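
To make "local similarity structure" concrete, the sketch below computes the perplexity-calibrated neighbor probabilities that Stochastic Neighbor Embedding uses as its input, following the general t-SNE formulation of van der Maaten and Hinton. It is a minimal illustration under stated assumptions, not the pipeline used in this paper: the function names, the toy data (eight "voxels" with three "genes" each) and the perplexity of 2 are all hypothetical, and neither the Barnes-Hut approximation nor the optimization of the 2D map is shown.

```python
import math


def conditional_probabilities(sq_dists, perplexity, tol=1e-5, max_iter=200):
    """Return p_{j|i} for one point, given squared distances to all other points.

    The Gaussian precision beta = 1 / (2 * sigma^2) is found by bisection so
    that the entropy of the distribution matches log(perplexity).
    """
    target_entropy = math.log(perplexity)
    beta, beta_min, beta_max = 1.0, 0.0, math.inf
    d_min = min(sq_dists)  # shift for numerical stability; cancels on normalizing
    probs = []
    for _ in range(max_iter):
        weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
        total = sum(weights)
        probs = [w / total for w in weights]
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        diff = entropy - target_entropy
        if abs(diff) < tol:
            break
        if diff > 0:
            # Distribution too flat: narrow the kernel by raising beta.
            beta_min = beta
            beta = beta * 2.0 if beta_max == math.inf else (beta + beta_max) / 2.0
        else:
            # Distribution too peaked: widen the kernel by lowering beta.
            beta_max = beta
            beta = (beta + beta_min) / 2.0
    return probs


def sne_similarities(points, perplexity):
    """Return the n x n matrix of conditional probabilities p_{j|i} (rows sum to 1)."""
    n = len(points)
    matrix = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        sq_dists = [
            sum((a - b) ** 2 for a, b in zip(points[i], points[j])) for j in others
        ]
        probs = conditional_probabilities(sq_dists, perplexity)
        row = [0.0] * n
        for j, p in zip(others, probs):
            row[j] = p
        matrix.append(row)
    return matrix


# Hypothetical toy data: two groups of four "voxels", three "genes" each.
points = [
    (0.0, 0.0, 0.0), (1.0, 0.2, 0.0), (0.3, 1.1, 0.1), (0.9, 1.3, 0.4),
    (10.0, 10.0, 10.0), (11.0, 10.3, 10.1), (10.2, 11.2, 10.0), (10.8, 11.1, 10.5),
]
sim = sne_similarities(points, perplexity=2.0)

# Share of each point's neighbor probability that stays inside its own group.
within_group = [
    sum(sim[i][j] for j in range(8) if (j < 4) == (i < 4)) for i in range(8)
]
print(within_group)
```

In this toy setting almost all of each point's probability mass falls on members of its own group, which is the sense in which the embedding objective emphasizes local neighborhoods over large pairwise distances. A linear method such as PCA, by contrast, is driven by the directions of greatest overall variance.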
