Tendon-driven control of biomechanical and robotic systems: A path integral reinforcement learning approach

We apply path integral reinforcement learning first to a biomechanically accurate dynamics model of the index finger and then to the Anatomically Correct Testbed (ACT) robotic hand. We show that Policy Improvement with Path Integrals (PI²) applies to both parameterized and non-parameterized control policies. The method samples variations in control, executes them in the real world, and minimizes a cost function evaluated on the resulting performance. Because the control policy is improved iteratively from real-world performance, no direct model of tendon network nonlinearities or contact transitions is required, and task performance still improves.
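The sample, execute, and re-weight loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, episode-level variant of the path-integral update (one cost per rollout, no time-dependent weighting and no basis-function projection), and the names `pi2_style_update`, `toy_cost`, `sigma`, and `h` are hypothetical. The quadratic `toy_cost` merely stands in for executing a rollout on the finger model or the ACT hand and scoring it.

```python
import numpy as np


def pi2_style_update(theta, rollout_cost, n_rollouts=20, sigma=0.1, h=10.0, rng=None):
    """One simplified path-integral-style update of the policy parameters theta.

    Assumption: each rollout is summarized by a single scalar cost.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample variations in control (here: Gaussian perturbations of the parameters).
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    # "Execute" each perturbed policy and record the resulting cost.
    costs = np.array([rollout_cost(theta + e) for e in eps])
    # Rescale costs to [0, 1] so the exponent is insensitive to the cost scale.
    span = costs.max() - costs.min()
    scaled = (costs - costs.min()) / (span if span > 0 else 1.0)
    # Low-cost rollouts get exponentially larger weights (softmax over negative cost).
    weights = np.exp(-h * scaled)
    weights /= weights.sum()
    # The update is the probability-weighted average of the perturbations.
    return theta + weights @ eps


def toy_cost(theta):
    """Hypothetical stand-in for a real rollout: squared distance to a fixed target."""
    target = np.array([0.5, -0.2, 0.8])
    return float(np.sum((theta - target) ** 2))


rng = np.random.default_rng(0)
theta = np.zeros(3)
initial_cost = toy_cost(theta)
for _ in range(50):
    theta = pi2_style_update(theta, toy_cost, rng=rng)
final_cost = toy_cost(theta)
```

Note that the update uses only the observed costs of executed rollouts. No gradient of the cost and no model of the plant is needed, which is the property the abstract relies on when it avoids modeling tendon nonlinearities and contact transitions.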
