Task Space Retrieval Using Inverse Feedback Control

Learning complex skills by repeating and generalizing expert behavior is a fundamental problem in robotics. A common approach is learning from demonstration: given examples of correct motions, learn a policy that maps state to action consistently with the training data. However, the usual approaches do not answer the question of which representations are appropriate for generating motions for a specific task. Inspired by Inverse Optimal Control, we present Task Space Retrieval Using Inverse Feedback Control (TRIC), a novel method that learns latent costs, imitates and generalizes demonstrated behavior, and discovers a task-relevant motion representation. We use the learned latent costs to create motion with a feedback controller. We tested our method on robot grasping of objects, a challenging high-dimensional task. TRIC learns the control dimensions that are important for the grasping task from a few example movements and is able to robustly approach and grasp objects in new situations.
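To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of a feedback controller that descends a cost defined over task-space features: the cost gradient in feature space is mapped back to joint space through the feature Jacobian. In TRIC the latent cost is learned from demonstrations; here it is replaced by a quadratic stand-in, and the feature map phi, the 2-link arm kinematics, and all function names are hypothetical placeholders.

```python
import numpy as np

def phi(q):
    """Hypothetical task-space feature map: end-effector position
    of a planar 2-link arm with unit link lengths."""
    x = np.cos(q[0]) + np.cos(q[0] + q[1])
    y = np.sin(q[0]) + np.sin(q[0] + q[1])
    return np.array([x, y])

def phi_jacobian(q, eps=1e-6):
    """Numerical Jacobian d phi / d q via finite differences."""
    f0 = phi(q)
    J = np.zeros((f0.size, q.size))
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        J[:, i] = (phi(q + dq) - f0) / eps
    return J

def latent_cost_grad(y, y_target):
    """Gradient of a stand-in 'learned' cost: a quadratic pulling the
    features toward a target extracted from demonstrations."""
    return y - y_target

def feedback_step(q, y_target, alpha=0.1):
    """One control step: pull the feature-space cost gradient back to
    joint space through the feature Jacobian and descend it."""
    J = phi_jacobian(q)
    grad_q = J.T @ latent_cost_grad(phi(q), y_target)
    return q - alpha * grad_q

if __name__ == "__main__":
    q = np.array([0.3, 0.5])           # initial joint configuration
    y_target = np.array([1.2, 0.8])    # demonstrated goal in feature space
    for _ in range(200):
        q = feedback_step(q, y_target)
    print("final features:", phi(q), "target:", y_target)
```

The sketch only illustrates the control loop; the substance of TRIC lies in learning which task-space features and latent costs matter, which is not modeled here.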
