Deep neuroethology of a virtual rodent

Parallel developments in neuroscience and deep learning have led to mutually productive exchanges, advancing our understanding of real and artificial neural networks in sensory and cognitive systems. However, this interaction between fields is less developed in the study of motor control. In this work, we develop a virtual rodent as a platform for the grounded study of motor activity in artificial models of embodied control. We then use this platform to study motor activity across contexts by training a model to solve four complex tasks. Using methods familiar to neuroscientists, we take a neuroethological approach: we describe the behavioral representations and algorithms employed by different layers of the network, and we characterize motor activity relative to the rodent's behavior and goals. We find that the model uses two classes of representations that encode, respectively, task-specific behavioral strategies and task-invariant behavioral kinematics. These representations are reflected in the sequential activity and population dynamics of neural subpopulations. Overall, the virtual rodent facilitates grounded collaborations between deep reinforcement learning and motor neuroscience.
