Approximate policy transfer applied to simulated Bongo Board balance

Developing global policies for humanoid robots using dynamic programming is difficult because these robots have many degrees of freedom. We present a formalism in which the value function for a humanoid robot is approximated using the known value functions of similar systems. These similar systems can include approximate models of the robot with reduced dimensionality or trajectories derived from human motion capture data. Once an approximate value function is known, a local controller is used to compute control signals. The approximate value function provides information about the global strategies that should be used to solve the task, while the local controller provides complementary information about the robot's dynamics. We present an implementation of this strategy and simulation results generated by this implementation.
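The division of labor between a borrowed value function and a local controller can be illustrated with a minimal sketch. It is not the authors' implementation and does not model the Bongo Board. It assumes a torque-driven inverted pendulum as the full system, the pendulum's linearization as the "similar system" whose value function is known (obtained here by Riccati iteration), and a one-step lookahead over a grid of candidate torques as the local controller. All names, cost weights, and constants are hypothetical, and `project` is the identity here, whereas for a reduced-dimensionality model it would map the full state onto the reduced one.

```python
import numpy as np

DT = 0.01          # integration step in seconds (assumed)
G_OVER_L = 9.81    # gravity / pendulum length, unit length assumed

# "Similar system": linearized inverted pendulum, z = (angle, angular velocity).
A = np.array([[1.0, DT], [G_OVER_L * DT, 1.0]])
B = np.array([[0.0], [DT]])
Q = np.diag([10.0, 1.0])   # state cost weights (assumed)
R = np.array([[0.1]])      # torque cost weight (assumed)


def solve_reduced_value(A, B, Q, R, iters=5000):
    """Iterate the discrete Riccati equation; V_similar(z) = z @ P @ z."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        P = 0.5 * (P + P.T)  # keep P symmetric against round-off
    return P


def project(x):
    """Map the full state to the similar system's state (identity here)."""
    return x


def approx_value(x, P):
    """Approximate value of the full system, borrowed from the similar system."""
    z = project(x)
    return float(z @ P @ z)


def full_step(x, u):
    """Nonlinear pendulum dynamics, one semi-implicit Euler step."""
    theta, omega = x
    omega = omega + DT * (G_OVER_L * np.sin(theta) + u)
    theta = theta + DT * omega
    return np.array([theta, omega])


def local_controller(x, P, torques=np.linspace(-10.0, 10.0, 201)):
    """One-step lookahead: pick the torque minimizing stage cost + approx value."""
    def total_cost(u):
        stage = float(x @ Q @ x) + float(R[0, 0]) * u * u
        return stage + approx_value(full_step(x, u), P)
    return min(torques, key=total_cost)
```

In this sketch the quadratic form `P` carries the global information (which states are costly in the long run), while `full_step` supplies the local knowledge of the nonlinear dynamics that the linearized model lacks. Calling `local_controller(x, solve_reduced_value(A, B, Q, R))` inside a simulation loop yields a feedback policy for the full system.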
