Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions

Following verbal route instructions requires knowledge of language, space, action, and perception. We present MARCO, an agent that follows free-form, natural language route instructions by representing and executing a sequence of compound action specifications that model which actions to take under which conditions. MARCO infers implicit actions from knowledge of linguistic conditional phrases and of spatial action and local configurations. Thus, MARCO performs explicit actions, implicit actions necessary to achieve the stated conditions, and exploratory actions to learn about the world. We gathered a corpus of 786 route instructions from six people in three large-scale virtual indoor environments. Thirty-six other people followed these instructions and rated them for quality. These human participants finished at the intended destination on 69% of the trials. MARCO followed the same instructions in the same environments, with a success rate of 61%. We measured the efficacy of action inference by testing MARCO variants that lack it: when executing only explicit actions, MARCO succeeded on just 28% of the trials. For this task, inferring implicit actions is essential for following poor instructions, and it is also crucial for many highly rated route instructions.
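
The difference between executing only explicit actions and also inferring implicit ones can be illustrated with a toy sketch. This is not MARCO's implementation: the one-dimensional hallway, the `Agent` class, and the `follow_travel` function are all hypothetical names introduced here, and the only implicit action modeled is turning to face a target before travelling to it.

```python
from dataclasses import dataclass

# Hypothetical toy world: a single hallway of positions, each holding an optional object.
HALLWAY = ["lamp", None, None, "chair", None]


@dataclass
class Agent:
    position: int
    heading: int  # +1 faces higher indices, -1 faces lower indices

    def sees_ahead(self, target):
        """Return True if the target is at the agent's position or ahead along its heading."""
        pos = self.position
        while 0 <= pos < len(HALLWAY):
            if HALLWAY[pos] == target:
                return True
            pos += self.heading
        return False

    def turn_around(self):
        self.heading = -self.heading

    def travel_until(self, target):
        """Explicit action: move forward until standing at the target or blocked by a wall."""
        while HALLWAY[self.position] != target:
            nxt = self.position + self.heading
            if not (0 <= nxt < len(HALLWAY)):
                return False  # ran out of hallway without reaching the target
            self.position = nxt
        return True


def follow_travel(agent, target, infer_implicit):
    """Follow 'go to the <target>' as a compound action.

    The instruction states only the travel; with infer_implicit set, the agent
    first turns if the target is not visible ahead (the assumed precondition).
    """
    actions = []
    if infer_implicit and not agent.sees_ahead(target):
        agent.turn_around()
        actions.append("turn")
    reached = agent.travel_until(target)
    actions.append("travel")
    return reached, actions
```

Starting at the far end of the hallway and facing the wall, `follow_travel(Agent(4, 1), "chair", infer_implicit=True)` inserts a turn and reaches the chair, while the same call with `infer_implicit=False` travels into the wall and fails. The sketch only mirrors the abstract's contrast between the full agent and its explicit-only variant; it says nothing about how MARCO actually represents conditions.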
