Integrative analysis of the connectivity and gene expression atlases in the mouse brain

Brain function arises from signal transmission between neurons, which is governed by the biochemistry of each neuron. The biochemical content of a neuron is in turn determined by spatiotemporal gene expression and regulation, as encoded in genomic regulatory networks. It is therefore of particular interest to elucidate the relationship between gene expression patterns and connectivity in the brain. However, systematic studies of this relationship within a single mammalian brain have been lacking. Here, we investigate this relationship in the mouse brain using data from the Allen Brain Atlas. We employ computational models that predict brain connectivity from gene expression data. In addition to giving competitive predictive performance, these models rank genes according to their predictive power. We show that gene expression is predictive of connectivity in the mouse brain when the connectivity signals are discretized. Using the expression patterns of 4084 genes, we obtain a predictive accuracy of 93%. Our results also show that a small number of genes provides nearly the full predictive power of thousands of genes: with only 25 genes, we achieve a prediction accuracy of 91%. Gene ontology analysis of the highly ranked genes shows that they are enriched for connectivity-related processes.
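
As an illustration of how a single model can both predict discretized connectivity and rank genes, the sketch below fits an L1-regularized logistic regression by proximal gradient descent on synthetic data. The choice of model, the synthetic data, and all names (`fit_sparse_logistic`, `rank_genes`, `make_synthetic`) are assumptions made for illustration only. The abstract does not state which models were used, and nothing here reproduces the reported accuracies.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict_proba(w, b, x):
    """Probability that a sample (e.g. a brain region) is 'connected'."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_sparse_logistic(X, y, l1=0.05, step=0.5, n_iter=300):
    """L1-regularized logistic regression via proximal gradient descent.

    X: list of expression vectors (one value per gene), y: 0/1 labels.
    The hyperparameter defaults are arbitrary illustrative choices.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(w[j] - step * grad_w[j] / n, step * l1)
             for j in range(d)]
    return w, b


def rank_genes(w):
    """Order gene indices by absolute weight, most predictive first."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


def make_synthetic(n_samples=200, n_genes=20, informative=(0, 1, 2), seed=0):
    """Hypothetical data: only the 'informative' genes drive the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        score = sum(2.0 * x[j] for j in informative) + rng.gauss(0.0, 0.5)
        X.append(x)
        y.append(1 if score > 0 else 0)  # discretized connectivity label
    return X, y


X, y = make_synthetic()
w, b = fit_sparse_logistic(X, y)
ranking = rank_genes(w)
predictions = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print("top-ranked genes:", ranking[:3])
print("training accuracy:", round(accuracy, 3))
```

The L1 penalty drives the weights of uninformative genes toward zero, so the surviving weights give a natural ranking. Refitting on only the top-ranked genes would be one way to probe how much predictive power a small gene subset retains.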
