Learning and Transfer of Modulated Locomotor Controllers

We study a novel architecture and training procedure for locomotion tasks. A high-frequency, low-level "spinal" network with access to proprioceptive sensors learns sensorimotor primitives by training on simple tasks. This pre-trained module is then fixed and connected to a low-frequency, high-level "cortical" network, which has access to all sensors and drives behavior by modulating the inputs to the spinal network. Where a monolithic end-to-end architecture fails completely, learning with a pre-trained spinal module succeeds at multiple high-level tasks and enables the effective exploration required to learn from sparse rewards. We test our proposed architecture on three simulated bodies: a 16-dimensional swimming snake, a 20-dimensional quadruped, and a 54-dimensional humanoid. Our results are illustrated in the accompanying video.
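
The two-timescale structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the single-layer tanh networks, the dimensions, the update period, and the random placeholder observations are all assumptions made for illustration. In the paper the spinal module is pre-trained and then frozen; here both modules simply use random weights so the control flow can be run on its own.

```python
import numpy as np

class SpinalController:
    """Hypothetical low-level policy: sees proprioception plus a modulation vector."""
    def __init__(self, proprio_dim, mod_dim, action_dim, rng):
        # Random weights stand in for a pre-trained, frozen module (assumption).
        self.W = rng.normal(scale=0.1, size=(action_dim, proprio_dim + mod_dim))

    def act(self, proprio, modulation):
        return np.tanh(self.W @ np.concatenate([proprio, modulation]))

class CorticalController:
    """Hypothetical high-level policy: sees all sensors, emits a modulation vector."""
    def __init__(self, obs_dim, mod_dim, rng):
        self.W = rng.normal(scale=0.1, size=(mod_dim, obs_dim))

    def modulate(self, full_obs):
        return np.tanh(self.W @ full_obs)

def rollout(spinal, cortical, observations, proprio_dim, period):
    """Run both controllers over a (T, obs_dim) array of observations.

    The first `proprio_dim` entries of each observation are treated as
    proprioceptive (an assumption of this sketch). The cortical controller
    updates the modulation only every `period` steps; the spinal controller
    acts at every step.
    """
    actions, updates, modulation = [], 0, None
    for t, obs in enumerate(observations):
        if t % period == 0:  # low-frequency, high-level update
            modulation = cortical.modulate(obs)
            updates += 1
        # high-frequency, low-level action from proprioception + modulation
        actions.append(spinal.act(obs[:proprio_dim], modulation))
    return np.stack(actions), updates

rng = np.random.default_rng(0)
spinal = SpinalController(proprio_dim=6, mod_dim=3, action_dim=4, rng=rng)
cortical = CorticalController(obs_dim=10, mod_dim=3, rng=rng)
obs = rng.normal(size=(20, 10))  # placeholder sensor readings, no physics
actions, n_updates = rollout(spinal, cortical, obs, proprio_dim=6, period=5)
```

With 20 steps and a period of 5, the cortical controller is queried four times while the spinal controller produces an action at every step. In a real training setup only the cortical weights would be learned on the high-level task, with the spinal weights held fixed.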
