Deep compositional robotic planners that follow natural language commands

We show how a sampling-based robotic planner can be augmented so that it learns to follow a sequence of natural language commands, moving and manipulating objects in a continuous configuration space. Our approach combines a sampling-based planner, RRT, with a deep network structured according to the parse of a complex command that includes objects, verbs, spatial relations, and attributes. This recurrent hierarchical network controls how the planner explores the environment, determines when a planned path is likely to achieve a goal, and estimates the confidence of each move in order to trade off exploitation by the network against exploration by the planner. Planners are designed to behave near-optimally when information about the task is missing, while networks learn to exploit the observations that are available from the environment, which makes the two naturally complementary. Combining them enables generalization to new maps, new kinds of obstacles, and more complex sentences that do not occur in the training set. The model requires little training data even though it jointly learns a CNN that extracts features from the environment along with the meanings of words. Although it is trained end to end, the model offers a level of interpretability: its attention maps let users inspect its reasoning steps. The resulting model allows robots to learn to follow natural language commands in challenging continuous environments.
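
The confidence-based trade-off between the network and the planner can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `propose` callback is a hypothetical stand-in for the learned network, the geometric goal test replaces the network's learned estimate of goal achievement, and all names and parameter values (`guided_rrt`, `steer`, `toward_goal`, step size, bounds) are assumptions chosen for illustration. At each iteration the planner follows the proposed move with probability equal to the reported confidence, and otherwise falls back to a standard RRT extension toward a uniform random sample.

```python
import math
import random
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]


def steer(src: Point, dst: Point, step: float) -> Point:
    """Move from src toward dst by at most `step`."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return dst
    return (src[0] + step * dx / dist, src[1] + step * dy / dist)


def guided_rrt(
    start: Point,
    goal: Point,
    propose: Callable[[Point], Tuple[Point, float]],  # hypothetical model
    is_free: Callable[[Point], bool],
    bounds: Tuple[float, float] = (0.0, 10.0),
    step: float = 0.5,
    goal_tol: float = 0.5,
    max_iters: int = 5000,
    seed: int = 0,
) -> Optional[List[Point]]:
    """RRT whose extensions are mixed with model proposals by confidence."""
    rng = random.Random(seed)
    nodes: List[Point] = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # The stand-in model returns a target and a confidence in [0, 1].
        target, confidence = propose(nodes[-1])
        if rng.random() < confidence:
            # Exploit: extend the most recent node toward the proposal.
            near, sample = len(nodes) - 1, target
        else:
            # Explore: ordinary RRT step toward a uniform random sample.
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
            near = min(range(len(nodes)),
                       key=lambda i: math.dist(nodes[i], sample))
        new = steer(nodes[near], sample, step)
        if not is_free(new):  # only the new point is checked, not the edge
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = near
        # Assumption: a geometric goal test instead of a learned one.
        if math.dist(new, goal) <= goal_tol:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


def toward_goal(goal: Point, confidence: float) -> Callable:
    """Toy proposal: always point at the goal with a fixed confidence."""
    return lambda node: (goal, confidence)


if __name__ == "__main__":
    # A wall the toy proposal cannot see; exploration must route around it.
    def wall(p: Point) -> bool:
        return not (4.5 <= p[0] <= 5.5 and p[1] < 7.0)

    path = guided_rrt((1.0, 1.0), (9.0, 9.0),
                      toward_goal((9.0, 9.0), 0.7), wall)
    print("no path found" if path is None else f"{len(path)} waypoints")
```

With a confidence of 1.0 and no obstacles the sketch reduces to a straight-line walk toward the proposal, and with a confidence of 0.0 it reduces to plain RRT. Intermediate values let random exploration recover when the proposed move is blocked.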
