Learning potential-based policies from constrained motion

We present a method for learning potential-based policies from constrained motion data. In contrast to previous approaches to direct policy learning, our method can combine observations from a variety of contexts in which different constraints are in force, and so learn the underlying unconstrained policy in the form of its potential function. This allows us to generalise and predict behaviour when novel constraints apply. As a key ingredient, we first create multiple simple local models of the potential and align them using an efficient algorithm. We can then detect and discard unsuitable subsets of the data and learn a global model from a cleanly pre-processed training set. We demonstrate our approach on systems of varying complexity, including kinematic data from the ASIMO humanoid robot with 22 degrees of freedom.
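To make the setting concrete, the following is a minimal sketch of how one unconstrained potential-based policy can produce different observed motions under different constraints. It is an illustration under stated assumptions, not the method of the paper: the quadratic potential, the single-normal constraint model (an orthogonal projection that removes the motion component along the normal), and the names `policy`, `constrain` and `dot` are all hypothetical.

```python
import math


def policy(x, goal):
    """Unconstrained policy: negative gradient of the assumed potential
    phi(x) = 0.5 * ||x - goal||^2, which is simply goal - x."""
    return [g - xi for xi, g in zip(x, goal)]


def dot(a, b):
    """Euclidean inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def constrain(u, normal):
    """Assumed constraint model: remove the component of u along `normal`,
    i.e. apply the orthogonal projection N = I - n n^T with n = normal / ||normal||."""
    norm = math.sqrt(dot(normal, normal))
    n_hat = [n / norm for n in normal]
    along = dot(u, n_hat)
    return [ui - along * ni for ui, ni in zip(u, n_hat)]


x = [1.0, 2.0]
goal = [0.0, 0.0]

pi = policy(x, goal)                     # unconstrained action at x
observed_a = constrain(pi, [1.0, 0.0])   # context A: motion along axis 0 is blocked
observed_b = constrain(pi, [0.0, 1.0])   # context B: motion along axis 1 is blocked

print("unconstrained:", pi)
print("context A:    ", observed_a)
print("context B:    ", observed_b)
```

Under this assumed projection model, each context reveals only part of the policy, yet the component of the unconstrained policy along the observed motion is preserved (`dot(observed_a, pi)` equals `dot(observed_a, observed_a)`). That is one way to see why observations from several constraint contexts can jointly carry information about a single underlying potential.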
