Towards a supervised classification of neocortical interneuron morphologies

Background: The challenge of classifying cortical interneurons is yet to be solved. Data-driven classification into established morphological types may provide insight and practical value.

Results: We trained models using 217 high-quality morphologies of rat somatosensory neocortex interneurons reconstructed by a single laboratory and pre-classified into eight types. We quantified 103 axonal and dendritic morphometrics, including novel ones that capture features such as arbor orientation, extent in layer one, and dendritic polarity. We trained a one-versus-rest classifier for each type, combining well-known supervised classification algorithms with feature selection and over- and under-sampling. We accurately classified the nest basket, Martinotti, and basket cell types, with the Martinotti model outperforming 39 out of 42 leading neuroscientists. We had moderate accuracy for the double bouquet, small and large basket types, and limited accuracy for the chandelier and bitufted types. We characterized the types with interpretable models or with up to ten morphometrics.

Conclusion: Except for large basket, 50 high-quality reconstructions sufficed to learn an accurate model of a type. Improving these models may require quantifying complex arborization patterns and finding correlates of bouton-related features. Our study brings attention to practical aspects important for neuron classification and is readily reproducible, with all code and data available online.
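The one-versus-rest setup described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier, the mean-gap feature filter, random over-sampling, the helper names, and the toy morphometric values are all assumptions chosen only to keep the example dependency-free.

```python
import random
from collections import Counter

def binarize(labels, target):
    """One-versus-rest relabeling: 1 for the target type, 0 for all others."""
    return [1 if lab == target else 0 for lab in labels]

def oversample(X, y, rng):
    """Randomly duplicate minority-class rows until both classes are equal in size."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [x for x, lab in zip(X, y) if lab == minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    return X + extra, y + [minority] * deficit

def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def select_features(X, y, k):
    """Keep the k features whose class means differ most (a crude univariate filter)."""
    pos = centroid([x for x, lab in zip(X, y) if lab == 1])
    neg = centroid([x for x, lab in zip(X, y) if lab == 0])
    gaps = [abs(p - q) for p, q in zip(pos, neg)]
    return sorted(range(len(gaps)), key=lambda j: -gaps[j])[:k]

def fit_one_vs_rest(X, labels, target, k, seed=0):
    """Fit a binary 'target versus rest' nearest-centroid model (hypothetical stand-in)."""
    rng = random.Random(seed)
    y = binarize(labels, target)
    Xb, yb = oversample(X, y, rng)        # balance the two classes
    feats = select_features(Xb, yb, k)    # then pick k discriminative features

    def project(x):
        return [x[j] for j in feats]

    pos = centroid([project(x) for x, lab in zip(Xb, yb) if lab == 1])
    neg = centroid([project(x) for x, lab in zip(Xb, yb) if lab == 0])

    def predict(x):
        z = project(x)
        d_pos = sum((a - b) ** 2 for a, b in zip(z, pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(z, neg))
        return 1 if d_pos < d_neg else 0

    return predict, feats

# Toy data: two made-up morphometrics per cell; only the first one is informative.
X = [[5.0, 1.0], [5.2, 1.0], [1.0, 1.0], [0.8, 1.0], [1.2, 1.0], [1.1, 1.0]]
labels = ["Martinotti", "Martinotti", "basket", "basket", "chandelier", "chandelier"]

predict, feats = fit_one_vs_rest(X, labels, "Martinotti", k=1)
print(feats, predict([5.1, 1.0]), predict([0.9, 1.0]))
```

One such model would be fit per type. When estimating accuracy, resampling and feature selection should be applied to the training folds only, since applying them before the split leaks information into the held-out data.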
