Grounding English Commands to Reward Functions

As intelligent robots become more prevalent, methods to make interaction with the robots more accessible are increasingly important. Communicating the tasks that a person wants the robot to carry out via natural language, and training the robot to ground the natural language through demonstration, are especially appealing approaches for interaction, since they do not require a technical background. However, existing approaches map natural language commands to robot command languages that directly express the sequence of actions the robot should execute. This sequence is often specific to a particular situation and does not generalize to new situations. To address this problem, we present a system that grounds natural language commands into reward functions using demonstrations of different natural language commands being carried out in the environment. Because language is grounded to reward functions, rather than explicit actions that the robot can perform, commands can be high-level, carried out in novel environments autonomously, and even transferred to other robots with different action spaces. We demonstrate that our learned model can be both generalized to novel environments and transferred to a robot with a different action space than the action space used during training.
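The contrast between grounding to reward functions and grounding to action sequences can be illustrated with a small sketch. This is not the system described above: the command-to-reward table is hard-coded rather than learned from demonstrations, and the grid world, the `in_red_room` predicate, the two action sets, and the breadth-first `plan` function are all hypothetical names introduced for illustration. It only shows that a reward defined over states can be reused unchanged by a planner in a new environment or with a different action space.

```python
from collections import deque

# Hypothetical predicate over states, loosely in the spirit of an
# object-oriented state representation; not taken from the paper.
def in_red_room(state, env):
    return state in env["red_cells"]

# Hard-coded command-to-reward table (the paper learns this grounding from
# demonstrations; the table here is an illustrative assumption).
COMMAND_REWARDS = {
    "go to the red room": lambda state, env: 1.0 if in_red_room(state, env) else 0.0,
}

# Two hypothetical robots with different action spaces: name -> (dx, dy).
CARDINAL = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}
DIAGONAL = dict(CARDINAL, northeast=(1, 1), southwest=(-1, -1))

def plan(start, env, reward, actions):
    """Breadth-first search for a shortest action sequence to a rewarding state."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if reward(state, env) > 0:
            return path
        for name, (dx, dy) in actions.items():
            nxt = (state[0] + dx, state[1] + dy)
            in_bounds = 0 <= nxt[0] < env["size"] and 0 <= nxt[1] < env["size"]
            if in_bounds and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [name]))
    return None  # no rewarding state is reachable

reward = COMMAND_REWARDS["go to the red room"]
train_env = {"size": 3, "red_cells": {(2, 2)}}
novel_env = {"size": 5, "red_cells": {(0, 4), (1, 4)}}

path_cardinal = plan((0, 0), train_env, reward, CARDINAL)   # same reward, robot A
path_diagonal = plan((0, 0), train_env, reward, DIAGONAL)   # same reward, robot B
path_novel = plan((0, 0), novel_env, reward, CARDINAL)      # same reward, new map
```

In this toy setting the same reward function produces a four-step plan for the cardinal-only robot and a two-step plan for the robot that can also move diagonally, and it still yields a plan in the larger map. A fixed action sequence recorded in `train_env` would not carry over in either case.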
