Natural Walking With Musculoskeletal Models Using Deep Reinforcement Learning

Human gait optimality has recently been investigated using detailed musculoskeletal models, through either trajectory optimization or deep reinforcement learning (DRL). Trajectory optimization studies are limited by trajectory length and can only generate open-loop solutions. Existing DRL solutions provide closed-loop control policies without a trajectory length limit, but they either do not evaluate the naturalness of the resulting behaviour or directly impose experimental tracking data. In this letter, a DRL-based approach is proposed with a nature-inspired curriculum learning (CL) scheme and a neuromechanically inspired reward function. This approach generates close-to-natural human walking without the aid of experimental data. Our CL scheme is realized by an evolving reward function that first targets simpler behaviours such as standing and stepping, then gradually refines the gait. The gait that emerged from the closed-loop stochastic policy correlated strongly with human gait kinematics, with Pearson correlations of 0.95 and 0.83 at the hip and knee, respectively, and showed higher gait symmetry than two other DRL-based control policies trained without CL. Our approach also converged efficiently to a walking-capable policy. It can facilitate the development of assistive robotic systems by providing a “human” controller, and could enable decentralized adaptation between the agent and assistive robotic devices.
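As a rough illustration of the two ideas named in the abstract, the sketch below shows one way an evolving (curriculum) reward and a Pearson correlation check could be written. It is not the authors' implementation: the stage names, iteration thresholds, reward terms, and weights are all hypothetical, chosen only to show a reward whose terms switch on as training progresses, followed by the standard Pearson formula applied to two joint-angle trajectories.

```python
import math

# Hypothetical curriculum: each stage switches on more reward terms.
# Stage names, iteration thresholds, and weights are illustrative assumptions.
STAGES = [
    {"name": "stand",  "until": 100,  "w_alive": 1.0, "w_velocity": 0.0, "w_effort": 0.0},
    {"name": "step",   "until": 300,  "w_alive": 1.0, "w_velocity": 1.0, "w_effort": 0.0},
    {"name": "refine", "until": None, "w_alive": 1.0, "w_velocity": 1.0, "w_effort": 0.5},
]


def stage_for(iteration):
    """Return the curriculum stage active at a given training iteration."""
    for stage in STAGES:
        if stage["until"] is None or iteration < stage["until"]:
            return stage
    return STAGES[-1]


def reward(iteration, alive, velocity_error, effort):
    """Evolving reward: a survival bonus minus penalties that phase in by stage."""
    s = stage_for(iteration)
    return (
        s["w_alive"] * (1.0 if alive else 0.0)
        - s["w_velocity"] * velocity_error ** 2
        - s["w_effort"] * effort
    )


def pearson(x, y):
    """Pearson correlation between two equal-length sequences (e.g. joint angles)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

In this sketch, early iterations reward only staying upright, later ones add a velocity-tracking penalty, and the final stage adds an effort penalty to refine the gait. The `pearson` helper would be applied to a simulated joint-angle trajectory and a reference trajectory resampled over the same gait-cycle points.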
