Motion Planning Networks: Bridging the Gap Between Learning-Based and Classical Motion Planners

This paper presents Motion Planning Networks (MPNet), a computationally efficient, learning-based neural planner for solving motion planning problems. MPNet uses neural networks to learn general near-optimal heuristics for path planning in seen and unseen environments. It takes environment information, such as raw point clouds from depth sensors, together with a robot's initial and goal configurations, and recursively calls itself to generate connectable paths bidirectionally. In addition to finding directly connectable, near-optimal paths in a single pass, we show that worst-case theoretical guarantees can be proven by merging this neural strategy with classical sampling-based planners in a hybrid approach, while still retaining significant computational and optimality improvements. To train the MPNet models, we present an active continual learning approach that enables MPNet to learn from streaming data and to actively ask for expert demonstrations when needed, drastically reducing the data required for training. We validate MPNet against gold-standard and state-of-the-art planning methods on a variety of problems, from 2D to 7D robot configuration spaces, in challenging and cluttered environments. The results show significant and consistent improvements in performance, motivating neural planning in general as a modern strategy for solving motion planning problems efficiently.
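
To make the recursive, bidirectional path generation described above concrete, here is a minimal sketch of such a loop. The planning network pnet (mapping an encoded point cloud and a pair of configurations to the next configuration) and the straight-line collision check steerable are hypothetical stand-ins for MPNet's learned model and a real collision checker, not the paper's actual interfaces:

```python
# A minimal sketch of a bidirectional neural planning loop of the kind the
# abstract describes. `pnet` and `steerable` are hypothetical stand-ins.
import numpy as np

def neural_plan(start, goal, env_code, pnet, steerable, max_steps=80):
    """Grow two paths toward each other; each extension is a network prediction.

    start, goal : robot configurations (np.ndarray)
    env_code    : latent encoding of the obstacle point cloud (opaque here)
    pnet        : callable (env_code, current, target) -> next configuration
    steerable   : callable (a, b) -> bool, True if the straight segment a-b
                  is collision-free
    Returns a start-to-goal list of configurations, or None on failure.
    """
    path_a, path_b = [start], [goal]  # forward and backward partial paths
    for _ in range(max_steps):
        # One learned step from the tip of one path toward the other's tip.
        path_a.append(pnet(env_code, path_a[-1], path_b[-1]))
        # If the two tips can be joined directly, stitch the paths together.
        if steerable(path_a[-1], path_b[-1]):
            full = path_a + path_b[::-1]
            # The paths may have swapped roles; orient the result from start.
            return full if np.array_equal(full[0], start) else full[::-1]
        # Alternate which side expands on the next iteration.
        path_a, path_b = path_b, path_a
    return None  # hybrid variant: fall back to a classical planner here

# Toy stand-ins for demonstration only: step 20% of the way toward the
# target, and call any segment shorter than 0.5 "connectable" (no obstacles
# are modeled).
pnet = lambda code, cur, tgt: cur + 0.2 * (tgt - cur)
steerable = lambda a, b: np.linalg.norm(a - b) < 0.5
path = neural_plan(np.zeros(2), np.full(2, 4.0), None, pnet, steerable)
```

With these toy stand-ins the loop converges after a handful of alternating extensions. In the hybrid approach the abstract mentions, a failure of this single-pass loop would instead hand the problem to a classical sampling-based planner, which is what restores the worst-case guarantees.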
