Learning to Control Redundant Musculoskeletal Systems with Neural Networks and SQP: Exploiting Muscle Properties

Modeling biomechanical musculoskeletal systems reveals that the mapping from muscle stimulations to movement dynamics is highly nonlinear and complex, which makes such systems difficult to control with classical techniques. In this work, we not only investigate whether machine learning approaches are capable of learning a controller for such systems, but are especially interested in whether the structure of the musculoskeletal apparatus exhibits properties that are favorable for the learning task. In particular, we consider learning a control policy that maps target positions to muscle stimulations. To account for the high actuator redundancy of biomechanical systems, our approach combines a learned forward model, represented by a neural network, with sequential quadratic programming (SQP) to obtain the control policy. This formulation also lets us vary the co-contraction level, and hence change the stiffness of the system, and to incorporate optimality criteria such as small muscle stimulations. Experiments on both a simulated musculoskeletal model of a human arm and a real biomimetic muscle-driven robot show that our approach learns an accurate controller despite the high redundancy and nonlinearity, while remaining sample-efficient.
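
To make the approach concrete, the sketch below illustrates the basic idea under stated assumptions: a fixed placeholder `forward_model` stands in for the paper's learned neural network, scipy's SLSQP solver plays the role of the SQP step, and the effort weight `lam` and the mean-stimulation constraint are illustrative choices, not the authors' exact formulation.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical learned forward model: maps muscle stimulations u in [0, 1]^n
# to a predicted end-effector position. In the paper this is a neural network
# trained on observed (stimulation, movement) data; a fixed placeholder
# mapping stands in here so the sketch is self-contained.
W = np.array([[0.8, -0.5, 0.3, -0.2],
              [0.1,  0.6, -0.4,  0.5]])

def forward_model(u):
    return np.tanh(W @ u)

def solve_stimulations(target, n_muscles=4, lam=1e-2, cocontraction=None):
    """Find stimulations that reach `target` while penalizing large
    stimulations; optionally pin the mean stimulation, which sets the
    co-contraction level (and hence the stiffness) of the system."""
    def objective(u):
        err = forward_model(u) - target
        return err @ err + lam * (u @ u)  # tracking error + effort penalty

    constraints = []
    if cocontraction is not None:
        # Equality constraint: mean stimulation equals the desired level.
        constraints.append({"type": "eq",
                            "fun": lambda u: np.mean(u) - cocontraction})

    res = minimize(objective,
                   x0=np.full(n_muscles, 0.5),   # mid-range initial guess
                   method="SLSQP",               # the SQP step
                   bounds=[(0.0, 1.0)] * n_muscles,
                   constraints=constraints)
    return res.x

# Same target at low and high co-contraction: different stiffness, same pose.
target = np.array([0.3, -0.1])
print(np.round(solve_stimulations(target, cocontraction=0.2), 3))
print(np.round(solve_stimulations(target, cocontraction=0.6), 3))
```

Because the redundant muscles admit many stimulation patterns for the same target, pinning the mean stimulation selects one of them; raising `cocontraction` yields a stiffer but equally accurate pose.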
