Learning Adaptive Sampling Distributions for Motion Planning by Self-Imitation

Sampling-based motion planning algorithms are widely used because they are effective on problems with large state spaces: they grow a search tree incrementally, typically using uniform random sampling. The major bottleneck in their performance is the number of collision checks performed, which in turn depends on the sampling distribution. In this work, we present a framework to learn an adaptive, non-stationary sampling distribution that explicitly minimizes the search effort, measured by the number of collision checks performed. Our framework models the sequential nature of the problem by conditioning on both the instantaneous search tree over the robot configuration space and the workspace environment, encoding them with a conditional variational auto-encoder to learn a stochastic sampling policy. We encode the workspace environment with a convolutional network and the configuration-space planning tree with a recurrent neural network. We introduce an approximate oracle that can return multiple label samples for a partially solved planning problem by forward simulating it. We learn the stochastic sampling policy by imitation via iterative supervised learning. We call this self-supervised imitation of an oracle generated by forward simulation self-imitation. We validate our approach on a 4D kinodynamic helicopter planning problem with glideslope and curvature constraints, and on a 2D holonomic problem.
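To make the cost measure concrete, the following is a minimal sketch of a tree-growing planner in a 2D unit square that accepts a pluggable sampler and counts collision checks. It is not the learned policy described above: the names `plan`, `make_uniform_sampler`, `make_goal_biased_sampler`, and `always_free` are hypothetical, and the hand-written goal-biased sampler merely stands in for a sampling distribution that could depend on the current tree.

```python
import math
import random


def plan(sampler, start, goal, is_free, step=0.1, goal_tol=0.1, max_iters=2000):
    """Grow a tree from `start`; return (found, number_of_collision_checks).

    `sampler(nodes)` receives the current tree nodes, so a sampler may adapt
    to the search state (an assumption made for illustration only).
    """
    nodes = [start]
    checks = 0
    for _ in range(max_iters):
        target = sampler(nodes)
        # Nearest existing node to the sampled target (linear scan).
        near = min(nodes, key=lambda n: math.dist(n, target))
        d = math.dist(near, target)
        if d == 0.0:
            continue
        # Move at most `step` from the nearest node toward the target.
        scale = min(step, d) / d
        new = (near[0] + scale * (target[0] - near[0]),
               near[1] + scale * (target[1] - near[1]))
        checks += 1  # one collision check per candidate node
        if not is_free(new):
            continue
        nodes.append(new)
        if math.dist(new, goal) <= goal_tol:
            return True, checks
    return False, checks


def make_uniform_sampler(rng):
    """Stationary baseline: uniform samples in the unit square."""
    def sampler(nodes):
        return (rng.random(), rng.random())
    return sampler


def make_goal_biased_sampler(rng, goal, bias):
    """Hand-written stand-in for a non-uniform sampler: returns the goal
    with probability `bias`, otherwise a uniform sample."""
    def sampler(nodes):
        if rng.random() < bias:
            return goal
        return (rng.random(), rng.random())
    return sampler


def always_free(point):
    """Obstacle-free environment, used only to keep the sketch simple."""
    return True
```

Calling `plan` with each sampler on the same start, goal, and `is_free` function gives two collision-check counts that can be compared directly; this count is the quantity the abstract proposes to minimize. A learned sampler would replace the factory functions while keeping the same `sampler(nodes)` interface.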
