Towards Imitation Learning of Dynamic Manipulation Tasks: A Framework to Learn from Failures

We present an imitation learning approach for a dynamic fluid pouring task. Our approach learns from the errors humans make and from how they subsequently recover from those errors. We collect both successful and failed human demonstrations of the task. Our algorithm combines a support vector machine based classifier with iterative search to generate initial task parameters for the robot. Next, a refinement algorithm, which captures how demonstrators change parameters to transition from failure to success, enables the robot to address failures. Experimental results with a physical robot are reported to illustrate our approach.
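The pipeline described above (classifier, iterative search, refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional parameter vector (read here as a normalized tilt rate and pour duration), the toy demonstration data, the subgradient-descent linear SVM, the grid search, and the mean-correction refinement rule are all assumptions chosen only to keep the example small and self-contained.

```python
# Hypothetical sketch: every name, data point, and rule below is an
# illustrative assumption, not the method reported in the paper.

def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear SVM (hinge loss, L2 penalty) by subgradient descent."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 (success) or -1 (failure)
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # margin violated: hinge term contributes
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # only the regularizer contributes
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def score(model, x):
    """Signed decision value; larger means 'more likely a success'."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def search_initial_params(model, candidates):
    """Search step: pick the candidate the classifier rates highest."""
    return max(candidates, key=lambda c: score(model, c))

def mean_correction(failure_success_pairs):
    """Average parameter change demonstrators made from failure to success."""
    n = len(failure_success_pairs)
    dim = len(failure_success_pairs[0][0])
    return [sum(s[i] - f[i] for f, s in failure_success_pairs) / n
            for i in range(dim)]

def refine(params, correction):
    """Apply the demonstrated correction to a failed robot attempt."""
    return [p + c for p, c in zip(params, correction)]

# Toy demonstrations (assumed): parameter vectors scaled to [0, 1].
successes = [(0.6, 0.7), (0.7, 0.6), (0.8, 0.8)]
failures = [(0.2, 0.3), (0.3, 0.2), (0.1, 0.4)]
model = train_linear_svm(successes + failures, [1] * 3 + [-1] * 3)

# Candidate task parameters on a coarse grid.
candidates = [(i / 10, j / 10) for i in range(11) for j in range(11)]
initial = search_initial_params(model, candidates)

# Demonstrators' failure -> success transitions (assumed pairing).
pairs = [((0.2, 0.3), (0.6, 0.7)), ((0.3, 0.2), (0.7, 0.6))]
correction = mean_correction(pairs)

# Suppose the robot's attempt with these parameters failed.
failed_attempt = [0.3, 0.3]
refined = refine(failed_attempt, correction)
print(initial, correction, refined)
```

In this sketch the classifier is trained on both successful and failed demonstrations, the search only ranks candidates by decision value, and the refinement reuses the average parameter shift that demonstrators applied after failing. A real system would likely use a kernel SVM and a correction that depends on the type of failure observed.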
