A deep reinforcement learning based approach towards generating human walking behavior with a neuromuscular model

A gait model capable of generating human-like walking behavior at both the kinematic and the muscular level can serve as a useful framework for developing control schemes for humanoids and wearable robots such as exoskeletons and prostheses. In this work we demonstrated the feasibility of using a deep reinforcement learning based approach for neuromuscular gait modelling. A lower limb gait model consisting of seven segments, fourteen degrees of freedom, and twenty-two Hill-type muscles was built to capture human leg dynamics and the characteristics of muscle properties. We implemented the proximal policy optimization algorithm to learn the sensory-motor mappings (control policy) and generate human-like walking behavior for the model. Human motion capture data, muscle activation patterns, and metabolic cost estimation were included in the reward function for training. The results show that the model can closely reproduce human kinematics and ground reaction forces during walking. It is capable of generating human walking behavior at speeds ranging from 0.6 m/s to 1.2 m/s, and it is able to withstand unexpected hip torque perturbations during walking. We further explored the advantages of using the neuromuscular model over the ideal joint torque based model, and observed that the neuromuscular model is more sample efficient than the torque model.
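As a rough illustration of how a reward combining the three ingredients named above (motion capture tracking, muscle activation patterns, and metabolic cost) could be assembled, the following minimal sketch uses a weighted sum of exponentiated error terms. The functional form, the function name `imitation_reward`, and all weights and scale constants are assumptions made for illustration; the abstract does not specify them.

```python
import math


def imitation_reward(q, q_ref, act, act_ref, metabolic_rate,
                     w_kin=0.6, w_act=0.3, w_met=0.1,
                     k_kin=2.0, k_act=2.0, k_met=0.01):
    """Hypothetical composite reward for one simulation step.

    q, q_ref       : simulated and reference joint angles (rad)
    act, act_ref   : simulated and reference muscle activations (0..1)
    metabolic_rate : estimated metabolic power for this step
    All weights (w_*) and scales (k_*) are illustrative assumptions.
    """
    # Squared tracking errors against the reference data.
    kin_err = sum((a - b) ** 2 for a, b in zip(q, q_ref))
    act_err = sum((a - b) ** 2 for a, b in zip(act, act_ref))

    # Map each error or cost into (0, 1]; 1 means perfect or free.
    r_kin = math.exp(-k_kin * kin_err)
    r_act = math.exp(-k_act * act_err)
    r_met = math.exp(-k_met * metabolic_rate)

    # Weighted sum; with weights summing to 1 the reward stays in (0, 1].
    return w_kin * r_kin + w_act * r_act + w_met * r_met


if __name__ == "__main__":
    perfect = imitation_reward([0.1, 0.2], [0.1, 0.2], [0.5], [0.5], 0.0)
    off = imitation_reward([0.5, 0.2], [0.1, 0.2], [0.5], [0.5], 50.0)
    print(perfect, off)
```

A bounded, weighted-sum form like this keeps each term on a comparable scale, so a policy optimizer such as proximal policy optimization is not dominated by whichever raw quantity happens to have the largest units.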
