Learning Trajectory Distributions for Assisted Teleoperation and Path Planning

Several approaches have been proposed to assist humans in co-manipulation and teleoperation tasks given demonstrated trajectories. However, these approaches are not applicable when the demonstrations are suboptimal or when the generalization capabilities of the learned models cannot cope with changes in the environment. Yet in real co-manipulation and teleoperation tasks, the original demonstrations will often be suboptimal, and a learning system must be able to cope with new situations. This paper presents a reinforcement learning algorithm that can be applied to such problems. The proposed algorithm is initialized with a probability distribution of demonstrated trajectories and is based on the concept of relevance functions. We show how the relevance of trajectory parameters to optimization objectives is connected to the Pearson correlation coefficient. First, we demonstrate the efficacy of our algorithm on the assisted teleoperation of an object in a static virtual environment. We then extend the algorithm to dynamic environments using Gaussian process regression. The full framework is applied to make a point particle and a 7-DoF robot arm autonomously adapt their movements to changes in the environment, as well as to assist the teleoperation of a 7-DoF robot arm in a dynamic environment.
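As a rough illustration of the link between relevance and Pearson correlation mentioned in the abstract, the sketch below samples trajectory parameters from a Gaussian (standing in for a distribution of demonstrated trajectories), evaluates a toy objective, and scores each parameter by the absolute Pearson correlation between its sampled values and the objective. The function name `pearson_relevance`, the toy via-point cost, and the choice of the absolute correlation as a relevance score are assumptions made for illustration only; they are not taken from the paper and may differ from its actual relevance functions.

```python
import numpy as np


def pearson_relevance(params, objective_values):
    """Absolute Pearson correlation of each parameter with the objective.

    params: array of shape (n_samples, n_params), sampled trajectory parameters.
    objective_values: array of shape (n_samples,), one objective value per sample.
    Returns an array of shape (n_params,) with values in [0, 1].
    """
    params = np.asarray(params, dtype=float)
    objective_values = np.asarray(objective_values, dtype=float)

    # Center both quantities so that sums of products act as covariances.
    centered_p = params - params.mean(axis=0)
    centered_o = objective_values - objective_values.mean()

    cov = centered_p.T @ centered_o
    denom = np.sqrt((centered_p ** 2).sum(axis=0) * (centered_o ** 2).sum())

    # Parameters (or objectives) with zero variance get relevance 0.
    safe_denom = np.where(denom > 0, denom, 1.0)
    return np.abs(np.where(denom > 0, cov / safe_denom, 0.0))


# Hypothetical setup: 3 trajectory parameters drawn from a unit Gaussian.
rng = np.random.default_rng(0)
samples = rng.normal(size=(2000, 3))

# Toy objective (an assumption): squared distance of parameter 0 to a
# via-point at 2.0. Parameters 1 and 2 do not influence this objective.
costs = (samples[:, 0] - 2.0) ** 2

relevance = pearson_relevance(samples, costs)
print(relevance)
```

In this toy setting only the first parameter should receive a high score, which suggests how such a score could be used to restrict optimization to the parameters that matter for a given objective while leaving the others close to the demonstrated distribution. Note that Pearson correlation only captures linear dependence, so a parameter that affects the objective symmetrically around its mean could be scored as irrelevant.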
