Grounding Robot Plans from Natural Language Instructions with Incomplete World Knowledge

Our goal is to enable robots to interpret and execute high-level tasks conveyed using natural language instructions. For example, consider tasking a household robot to “prepare my breakfast”, “clear the boxes on the table”, or “make me a fruit milkshake”. Interpreting such underspecified instructions requires environmental context and background knowledge about how to accomplish complex tasks. Further, the robot’s workspace knowledge may be incomplete: the environment may be only partially observed, or background knowledge may be missing, causing plan synthesis to fail. We introduce a probabilistic model that uses background knowledge to infer latent or missing plan constituents based on semantic co-associations learned from noisy textual corpora of task descriptions. The ability to infer missing plan constituents enables information-seeking actions, such as visual exploration or dialogue with the human, that acquire the knowledge needed to complete the plan. Results indicate robust plan inference from underspecified instructions in partially known worlds.
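To make the idea of inferring a missing plan constituent from co-associations concrete, the following is a minimal sketch and not the model described above. It assumes a hypothetical toy corpus (`CORPUS`) in which each task description is reduced to a set of mentioned items, and it substitutes a simple naive-Bayes-style score with add-alpha smoothing for the paper's probabilistic model. The names `rank_missing` and `next_step` are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each entry is the set of items mentioned in one
# task description. A real corpus would be large and noisy.
CORPUS = [
    {"milkshake", "blender", "milk", "banana"},
    {"milkshake", "blender", "milk", "strawberry"},
    {"milkshake", "milk", "banana", "glass"},
    {"breakfast", "bowl", "cereal", "milk"},
    {"breakfast", "toast", "plate"},
]


def cooccurrence_counts(corpus):
    """Count how often each item, and each ordered pair of items, appears."""
    pair_counts = Counter()
    item_counts = Counter()
    for doc in corpus:
        item_counts.update(doc)
        for a, b in combinations(sorted(doc), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return pair_counts, item_counts


def rank_missing(known, candidates, corpus, alpha=1.0):
    """Rank candidates by the product of smoothed P(candidate | k) over the
    known constituents k, normalised over the candidate set.
    This is an assumed scoring rule, not the authors' model."""
    pair_counts, item_counts = cooccurrence_counts(corpus)
    vocab_size = len(item_counts)
    scores = {}
    for c in candidates:
        score = 1.0
        for k in known:
            score *= (pair_counts[(k, c)] + alpha) / (item_counts[k] + alpha * vocab_size)
        scores[c] = score
    total = sum(scores.values())
    return sorted(((c, s / total) for c, s in scores.items()), key=lambda x: -x[1])


def next_step(known, observed, candidates, corpus):
    """Use the inferred item if it is already observed in the workspace;
    otherwise flag an information-seeking action (exploration or dialogue)."""
    best, _ = rank_missing(known, candidates, corpus)[0]
    return ("use", best) if best in observed else ("seek", best)


if __name__ == "__main__":
    known = {"milkshake", "milk"}
    candidates = ["blender", "toast", "cereal"]
    print(rank_missing(known, candidates, CORPUS))
    print(next_step(known, {"milk", "banana"}, candidates, CORPUS))
```

In this sketch, an instruction such as “make me a fruit milkshake” with milk already observed leads to a blender being ranked as the most likely missing constituent; because no blender has been observed, the step returned is a "seek" action standing in for visual exploration or a question to the human.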
