Toward Information Theoretic Human-Robot Dialog

Our goal is to build robots that can robustly interact with humans using natural language. This problem is challenging because human language is filled with ambiguity and because, due to limitations in sensing, the robot's perception of its environment may be much more limited than that of its human partner. To enable a robot to recover from a failure to understand a natural language utterance, this paper describes an information-theoretic strategy for asking targeted clarifying questions and using information from the answer to disambiguate the language. To identify good questions, we derive an estimate of the robot's uncertainty about the mapping between specific phrases in the language and aspects of the external world. This metric enables the robot to ask a targeted question about the parts of the language about which it is most uncertain. After receiving an answer, the robot fuses information from the command, the question, and the answer in a joint probabilistic graphical model in the Generalized Grounding Graph (G3) framework. We show that, when using answers to questions, the robot infers mappings between parts of the language and concrete object groundings in the external world with higher accuracy than when using information from the command alone. Furthermore, we demonstrate that by effectively selecting which questions to ask, the robot achieves significant performance gains while asking far fewer questions than baseline metrics require.
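The question-selection idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the robot already holds a discrete distribution over candidate groundings for each phrase, uses Shannon entropy as the uncertainty estimate, and replaces the joint graphical-model inference with a simple Bayes update. The function names (`entropy`, `most_uncertain_phrase`, `update`), the example command, and all probabilities are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a distribution {grounding: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def most_uncertain_phrase(beliefs):
    """Pick the phrase whose grounding distribution has the highest entropy."""
    return max(beliefs, key=lambda phrase: entropy(beliefs[phrase]))


def update(prior, likelihood):
    """Bayes update: reweight each grounding by how well it explains the answer.

    This is an assumed stand-in for fusing the command, question, and answer
    in a joint model; it is not the inference procedure used in the paper.
    """
    unnormalized = {g: prior[g] * likelihood.get(g, 0.0) for g in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("answer is inconsistent with every candidate grounding")
    return {g: w / total for g, w in unnormalized.items()}


# Hypothetical beliefs for the command "put the pallet on the truck".
beliefs = {
    "the pallet": {"pallet_1": 0.5, "pallet_2": 0.5},
    "the truck": {"truck_1": 0.9, "truck_2": 0.1},
}

# Ask about the phrase the robot is least sure of.
target = most_uncertain_phrase(beliefs)

# Hypothetical answer likelihoods, e.g. the human says "the one on the left".
answer_likelihood = {"pallet_1": 0.9, "pallet_2": 0.1}
posterior = update(beliefs[target], answer_likelihood)

print(f"Ask about: {target!r}")
print(f"Entropy before: {entropy(beliefs[target]):.3f} bits")
print(f"Entropy after:  {entropy(posterior):.3f} bits")
```

In this toy setting the robot asks about "the pallet", since its two candidate groundings are equally likely, and the answer then concentrates the belief on one of them. Comparing entropies across phrases is what lets a robot skip questions about phrases it has already grounded with high confidence.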
