Learning to Interpret Natural Language Navigation Instructions from Observations

The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. We present a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns by simply observing how humans follow navigation instructions. The system is evaluated in three complex virtual indoor environments with numerous objects and landmarks. A previously collected realistic corpus of complex English navigation instructions for these environments provides the training and test data. By using a learned lexicon to refine inferred plans and a supervised learner to induce a semantic parser, the system automatically learns to correctly interpret a reasonable fraction of the complex instructions in this corpus.
