Enabling hyperparameter optimization in sequential autoencoders for spiking neural data

Continuing advances in neural interfaces have enabled simultaneous monitoring of spiking activity from hundreds to thousands of neurons. To interpret these large-scale data, several methods have been proposed to infer latent dynamic structure from high-dimensional datasets. One recent line of work uses recurrent neural networks in a sequential autoencoder (SAE) framework to uncover dynamics. SAEs are an appealing option for modeling nonlinear dynamical systems, and they enable a precise link between neural activity and behavior on a single-trial basis. However, the very large parameter count and complexity of SAEs relative to other models have raised concern that SAEs may only perform well on very large training sets. We hypothesized that, with a method to systematically optimize hyperparameters (HPs), SAEs might perform well even with limited training data. Such a breakthrough would greatly extend their applicability. However, we find that SAEs applied to spiking neural data are prone to a particular form of overfitting that cannot be detected using standard validation metrics, which prevents standard HP searches. We develop and test two potential solutions: an alternate validation method ("sample validation") and a novel regularization method ("coordinated dropout"). These innovations prevent overfitting effectively and allow us to test whether SAEs can achieve good performance on limited data through large-scale HP optimization. On data recorded from motor cortex while monkeys made reaches in various directions, large-scale HP optimization allowed SAEs to maintain performance better as dataset size shrank. Our results should greatly extend the applicability of SAEs in extracting latent dynamics from sparse, multidimensional data, such as neural population spiking activity.
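To make the masking idea concrete, the short NumPy sketch below illustrates coordinated dropout as described above. The function names, the keep probability of 0.7, the dropout-style rescaling, and the placeholder model callable are illustrative assumptions rather than the authors' implementation; the essential point is that each data element is randomly assigned either to the network's input or to the reconstruction loss, never both, so the model must predict withheld elements from the remaining ones rather than copying them through.

import numpy as np


def coordinated_dropout_masks(shape, keep_prob, rng):
    # Randomly split data elements into an "input" set and a "loss" set.
    # Elements with mask_in == 1 are fed to the network but excluded from
    # the reconstruction loss; the complementary elements (mask_loss == 1)
    # are hidden from the input and must be predicted from the rest.
    mask_in = rng.random(shape) < keep_prob
    mask_loss = ~mask_in
    return mask_in.astype(float), mask_loss.astype(float)


def coordinated_dropout_loss(spikes, model, keep_prob=0.7, rng=None):
    # One training-step loss under coordinated dropout (illustrative only).
    #   spikes : binned spike counts, shape (trials, time, neurons)
    #   model  : placeholder callable mapping masked spikes to inferred rates
    rng = rng or np.random.default_rng()
    mask_in, mask_loss = coordinated_dropout_masks(spikes.shape, keep_prob, rng)

    # Rescale surviving elements, as in standard dropout, so the expected
    # input magnitude is unchanged (an assumption of this sketch).
    masked_input = spikes * mask_in / keep_prob

    rates = model(masked_input)  # inferred firing rates, same shape as spikes

    # Poisson negative log-likelihood (up to a constant), evaluated only on
    # the elements withheld from the input -- the network cannot score well
    # here by simply passing single-neuron spiking through to its output.
    nll = rates - spikes * np.log(rates + 1e-8)
    return (nll * mask_loss).sum() / np.maximum(mask_loss.sum(), 1.0)

Sample validation follows a similar element-wise logic: rather than re-drawing the mask at every training step, a fixed random subset of data elements is withheld throughout training and its reconstruction error serves purely as a validation metric, exposing the overfitting mode that trial-level validation misses.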