Motion learning in variable environments using probabilistic flow tubes

Commanding an autonomous system through complex motions at a low level can be tedious or impractical for systems with many degrees of freedom. Allowing an operator to demonstrate the desired motions directly can often enable more intuitive and efficient interaction. Two central challenges in learning from demonstration are (1) how to represent learned motions so that they accurately reflect a human's intentions, and (2) how to make learned motions easily applicable in new situations. This paper introduces a novel representation of continuous actions called probabilistic flow tubes that provides flexibility during execution while robustly encoding a human's intended motions. Our approach also automatically determines certain qualitative characteristics of a motion so that these characteristics can be preserved when autonomously executing the motion in a new situation. We demonstrate the effectiveness of our motion learning approach both in a simulated two-dimensional environment and on the All-Terrain Hex-Limbed Extra-Terrestrial Explorer (ATHLETE) robot performing object manipulation tasks.
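The core idea of a flow tube can be sketched concretely. The following is a minimal illustration, not the paper's implementation: it assumes a flow tube is approximated by time-aligning several demonstrated trajectories with dynamic time warping and then fitting a per-timestep Gaussian (mean and covariance), so the tube's width at each step reflects how much variation the demonstrations allowed. The function names (`dtw_align`, `build_flow_tube`) are hypothetical.

```python
import numpy as np

def dtw_align(ref, traj):
    """Warp traj onto the time base of ref using dynamic time warping
    with Euclidean point-to-point cost."""
    n, m = len(ref), len(traj)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(ref[i - 1] - traj[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end to recover the warping path.
    match = {}
    i, j = n, m
    while i > 0 and j > 0:
        match[i - 1] = j - 1  # keep one traj index per ref index
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    aligned = np.array([traj[match.get(t, m - 1)] for t in range(n)])
    return aligned

def build_flow_tube(demos):
    """Fit a per-timestep Gaussian over DTW-aligned demonstrations.
    Returns (means, covs): the tube's center path and its width."""
    ref = demos[0]
    aligned = np.stack([ref] + [dtw_align(ref, d) for d in demos[1:]])
    means = aligned.mean(axis=0)
    # Small regularizer keeps each covariance positive definite.
    covs = np.array([np.cov(aligned[:, t].T) + 1e-6 * np.eye(ref.shape[1])
                     for t in range(len(ref))])
    return means, covs
```

During execution, a controller could treat the tube as a soft constraint: states near the mean path (as measured by the per-step covariance) are acceptable, so the system retains flexibility where the demonstrations varied and is constrained where they agreed.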
