Asymptotically Optimal Motion Planning for Tasks Using Learned Virtual Landmarks

Utilizing appropriate landmarks in the environment is often critical to planning a robot's motion for a given task. We propose a method that automatically learns task-relevant landmarks and incorporate it into an asymptotically optimal motion planner informed by a set of human-guided demonstrations. From kinesthetic demonstrations, our method learns a task model parameterized by the poses of virtual landmarks. The approach models a task using multivariate Gaussian distributions over a feature space that includes the robot's configurations and the relative positions of landmarks in the environment. The method automatically learns virtual landmarks as linear combinations or projections of sensed landmarks, whose poses are identified using the robot's kinematic model and vision sensors. To compute motion plans for the task in new environments, we parameterize the learned task model using the virtual landmark poses and compute paths that maximally adhere to the learned task model while avoiding obstacles. We experimentally evaluate our approach on two manipulation tasks using the Baxter robot in an environment with obstacles.
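As a rough illustration of the kind of cost function such a planner might minimize, the sketch below scores a path by the negative log-likelihood of its waypoint features under per-step multivariate Gaussians. All names and parameters here are illustrative assumptions, not the paper's implementation; feature vectors are assumed to concatenate the robot's configuration with landmark-relative positions.

```python
import numpy as np

def gaussian_nll(x, mean, cov):
    """Negative log-likelihood of feature vector x under N(mean, cov)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d @ np.linalg.inv(cov) @ d
                  + logdet + len(x) * np.log(2 * np.pi))

def path_cost(path, means, covs):
    """Sum the per-waypoint NLL along a path: lower cost means the path
    adheres more closely to the learned task model (one Gaussian per step)."""
    return sum(gaussian_nll(x, m, c) for x, m, c in zip(path, means, covs))
```

An asymptotically optimal planner would then search for the collision-free path minimizing this cost, so that waypoints are drawn toward the demonstrated distribution at each step of the task.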
