Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models

Humans excel at robust bipedal walking in complex natural environments. In each step, they adequately tune the interaction of biomechanical muscle dynamics and neuronal signals to be robust against uncertainties in ground conditions. However, it is still not fully understood how the nervous system resolves the musculoskeletal redundancy to solve the multi-objective control problem of balancing stability, robustness, and energy efficiency. In computer simulations, energy minimization has been shown to be a successful optimization target, reproducing natural walking with trajectory optimization or reflex-based control methods. However, these methods focus on one particular motion at a time, and the resulting controllers are limited in their ability to compensate for perturbations. In robotics, reinforcement learning (RL) methods have recently achieved highly stable (and efficient) locomotion on quadruped systems, but generating human-like walking with bipedal biomechanical models has required extensive use of expert data sets. This strong reliance on demonstrations often results in brittle policies and limits the application to new behaviors, especially considering the potential variety of movements for high-dimensional musculoskeletal models in 3D. Achieving natural locomotion with RL without sacrificing its robustness might pave the way for a novel approach to studying human walking in complex natural environments. Videos: https://sites.google.com/view/naturalwalkingrl
