Active Learning of Probabilistic Movement Primitives

A Probabilistic Movement Primitive (ProMP) defines a distribution over trajectories with an associated feedback policy. ProMPs are typically initialized from human demonstrations and achieve task generalization through probabilistic operations. However, the literature currently offers no principled guidance on how many demonstrations a teacher should provide or on what constitutes a “good” demonstration for promoting generalization. In this paper, we present an active learning approach to learning a library of ProMPs capable of task generalization over a given space. We use uncertainty sampling techniques to generate a task instance for which a teacher should provide a demonstration. The provided demonstration is incorporated into an existing ProMP if possible; if it is determined to be too dissimilar from existing demonstrations, a new ProMP is created from it. We provide a qualitative comparison between common active learning metrics. Motivated by this comparison, we present a novel uncertainty sampling approach named “Greatest Mahalanobis Distance.” We perform grasping experiments on a real KUKA robot and show that our novel active learning measure achieves better task generalization with fewer demonstrations than random sampling over the space.
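
The abstract does not define the “Greatest Mahalanobis Distance” measure, so the following is only a minimal sketch of the general idea of uncertainty sampling with a Mahalanobis distance, not the paper's method. It assumes, purely for illustration, that the task instances already covered by demonstrations are summarized by a single Gaussian (`mean`, `cov`) and that the next query is the candidate task instance farthest from that Gaussian. The function names and the toy data are hypothetical.

```python
import numpy as np


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from a Gaussian N(mean, cov)."""
    diff = x - mean
    # Solve cov * y = diff rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))


def greatest_mahalanobis_candidate(candidates, mean, cov):
    """Return (index, squared distance) of the candidate farthest from the Gaussian.

    Assumption: "farthest" stands in for "least well covered by existing
    demonstrations"; the paper's actual selection rule may differ.
    """
    dists = [mahalanobis_sq(c, mean, cov) for c in candidates]
    idx = int(np.argmax(dists))
    return idx, dists[idx]


# Hypothetical 2-D task space (e.g. planar object positions).
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.0], [0.0, 4.0]])
candidates = np.array([[1.0, 0.0], [0.0, 3.0], [2.0, 2.0]])

best_idx, best_dist = greatest_mahalanobis_candidate(candidates, mean, cov)
print(best_idx, best_dist)
```

In this toy setup, the selected candidate would be the task instance for which the teacher is asked to provide the next demonstration. With a library of several ProMPs, one plausible extension (again an assumption) is to score each candidate by its distance to the nearest ProMP before taking the maximum.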
