Differentiable Spatial Planning using Transformers

We consider the problem of spatial path planning. Classical solutions optimize a new plan from scratch for every problem and assume access to the full map with ground-truth obstacle locations. In contrast, we learn a planner from data in a differentiable manner, which lets us exploit statistical regularities in past data. We propose Spatial Planning Transformers (SPTs), which, given an obstacle map, learn to generate actions by planning over long-range spatial dependencies, unlike prior data-driven planners that propagate information locally and iteratively through convolutional structure. In the setting where the ground-truth map is not known to the agent, we use pre-trained SPTs in an end-to-end framework with the structure of a mapper and a planner built into it, which allows seamless generalization to out-of-distribution maps and goals. SPTs outperform prior state-of-the-art differentiable planners across all setups for both manipulation and navigation tasks, with an absolute improvement of 7-19%.
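To make the contrast with local, iterative propagation concrete, the following is a minimal sketch in plain Python. It is not the SPT model, nor any specific prior planner: the function name `local_propagation`, the 4-connected grid, and the unit step cost are all assumptions chosen for illustration. Each sweep updates a cell only from its immediate neighbors, so distance-to-goal information travels one cell per sweep.

```python
from typing import List, Tuple

INF = float("inf")

def local_propagation(grid: List[List[int]], goal: Tuple[int, int]):
    """Compute distance-to-goal by repeated local (4-neighbor) updates.

    grid[r][c] == 1 marks an obstacle, 0 marks free space (assumed encoding).
    Returns the distance map and the number of sweeps that changed a cell.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[INF] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    sweeps = 0
    while True:
        new = [row[:] for row in dist]  # synchronous update: read old, write new
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1 or (r, c) == goal:
                    continue
                best = dist[r][c]
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                        best = min(best, dist[nr][nc] + 1)
                if best < new[r][c]:
                    new[r][c] = best
                    changed = True
        if not changed:
            break
        dist = new
        sweeps += 1
    return dist, sweeps

# A wall forces a detour, so the far corner is 8 steps from the goal.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
dist, sweeps = local_propagation(grid, goal=(0, 0))
print(sweeps)  # 8
```

In this toy setting the number of sweeps equals the longest shortest-path length on the map, which is the kind of cost that grows with map size and obstacle layout when information only moves between adjacent cells. A model that relates distant cells directly is not bound to one cell per step in the same way.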
