Modern machine learning outperforms GLMs at predicting spikes

Neuroscience has long focused on finding encoding models that ask, in effect, "what predicts neural spiking?", and generalized linear models (GLMs) are the typical approach. Modern machine learning techniques have the potential to perform better. Here we directly compared GLMs to three leading methods: feedforward neural networks, gradient boosted trees, and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from reaching kinematics, and in rat hippocampal cells from open-field location and orientation. In general, the modern methods produced far better spike predictions and were less sensitive to the preprocessing of features. XGBoost and the ensemble were the best-performing methods and worked well even on neural data with very low spike rates. This overall performance suggests that tuning curves built with GLMs are at times inaccurate and can easily be improved upon. Our publicly shared code uses standard packages and can be quickly applied to other datasets. Encoding models built with machine learning techniques more accurately predict spikes and can offer meaningful benchmarks for simpler models.
