Fast Online Lexicon Learning for Grounded Language Acquisition

Learning a semantic lexicon is often an important first step in building a system that learns to interpret the meaning of natural language. It is especially important in language grounding, where the training data usually consist of language paired with an ambiguous perceptual context. Recent work by Chen and Mooney (2011) introduced a lexicon learning method that deals with ambiguous relational data by taking intersections of graphs. While the algorithm produced good lexicons for the task of learning to interpret navigation instructions, it works only in batch settings and does not scale well to large datasets. In this paper, we introduce a new online algorithm that is an order of magnitude faster and surpasses the state-of-the-art results. We show that, by changing the grammar of the formal meaning representation language and training on additional data collected from Amazon's Mechanical Turk, we can further improve the results. We also include experimental results on a Chinese translation of the training data to demonstrate the generality of our approach.
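To make the idea of intersection-based lexicon learning concrete, the following is a minimal sketch and not the algorithm of this paper or of Chen and Mooney (2011). It assumes a strong simplification: each perceptual context is a flat set of meaning symbols rather than a graph, so graph intersection reduces to set intersection. The names `update_lexicon` and `learn_online`, and the symbols such as `Turn` and `LEFT`, are hypothetical and chosen only for illustration. The sketch processes one example at a time, which is the sense in which it is online.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# One training example: an instruction paired with an ambiguous context.
# ASSUMPTION: the context is a flat set of symbols, not a graph.
Example = Tuple[str, FrozenSet[str]]
Lexicon = Dict[str, FrozenSet[str]]


def update_lexicon(lexicon: Lexicon, instruction: str,
                   context: FrozenSet[str]) -> None:
    """Refine the candidate meaning of each word using a single example."""
    for word in set(instruction.lower().split()):
        if word in lexicon:
            # Keep only the symbols shared by every context seen with the word.
            lexicon[word] = lexicon[word] & context
        else:
            # First occurrence: the whole context is a candidate meaning.
            lexicon[word] = context


def learn_online(examples: Iterable[Example]) -> Lexicon:
    """Process examples one at a time, never revisiting earlier ones."""
    lexicon: Lexicon = {}
    for instruction, context in examples:
        update_lexicon(lexicon, instruction, context)
    return lexicon


# Hypothetical toy data for illustration only.
examples = [
    ("turn left", frozenset({"Turn", "LEFT"})),
    ("turn right", frozenset({"Turn", "RIGHT"})),
    ("walk left", frozenset({"Travel", "LEFT"})),
]
lexicon = learn_online(examples)
```

In this toy run, `turn` and `left` each narrow to a single symbol because they appear in two different contexts, while `right` and `walk` remain ambiguous after one occurrence each. A realistic system would also need to handle multi-word phrases, noise, and structured meanings, none of which this sketch attempts.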

[1]  Mark Johnson, et al.  Reducing Grounded Learning Tasks To Grammatical Inference, 2011, EMNLP.

[2]  Christopher D. Manning, et al.  Optimizing Chinese Word Segmentation for Machine Translation Performance, 2008, WMT@ACL.

[3]  Afsaneh Fazly, et al.  A Probabilistic Computational Model of Cross-Situational Word Learning, 2010, Cogn. Sci.

[4]  Stefanie Tellex, et al.  Toward understanding natural language directions, 2010, 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[5]  Benjamin Kuipers, et al.  Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions, 2006, AAAI.

[6]  J. Siskind.  A computational study of cross-situational techniques for learning word-to-meaning mappings, 1996, Cognition.

[7]  Hwee Tou Ng, et al.  A Generative Model for Parsing Natural Language to Meaning Representations, 2008, EMNLP.

[8]  Rohit J. Kate, et al.  Using String-Kernels for Learning Semantic Parsers, 2006, ACL.

[9]  Brendan T. O'Connor, et al.  Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks, 2008, EMNLP.

[10]  Luke S. Zettlemoyer, et al.  Online Learning of Relaxed CCG Grammars for Parsing to Logical Form, 2007, EMNLP.

[11]  Raymond J. Mooney, et al.  Learning to Interpret Natural Language Navigation Instructions from Observations, 2011, AAAI.

[12]  Dan Klein, et al.  Learning Dependency-Based Compositional Semantics, 2011, CL.

[13]  Daniel Jurafsky, et al.  Learning to Follow Navigational Directions, 2010, ACL.

[14]  Raymond J. Mooney, et al.  Learning for Semantic Parsing, 2009, CICLing.

[15]  Dan Roth, et al.  Confidence Driven Unsupervised Semantic Parsing, 2011, ACL.

[16]  Benjamin Kuipers, et al.  Following natural language route instructions, 2007.

[17]  Mark Steedman, et al.  Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification, 2010, EMNLP.

[18]  Dan Klein, et al.  Learning Semantic Correspondences with Less Supervision, 2009, ACL.

[19]  Rohit J. Kate.  Transforming Meaning Representation Grammars to Improve Semantic Parsing, 2008, CoNLL.

[20]  Luke S. Zettlemoyer, et al.  Bootstrapping Semantic Parsers from Conversations, 2011, EMNLP.

[21]  Raymond J. Mooney, et al.  Acquiring Word-Meaning Mappings for Natural Language Interfaces, 2011, J. Artif. Intell. Res.

[22]  Raymond J. Mooney, et al.  Generative Alignment and Semantic Parsing for Learning from Ambiguous Supervision, 2010, COLING.

[23]  Dieter Fox, et al.  Following directions using statistical machine translation, 2010, 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[24]  Raymond J. Mooney, et al.  Learning to sportscast: a test of grounded language acquisition, 2008, ICML.

[25]  William B. Dolan, et al.  Collecting Highly Parallel Data for Paraphrase Evaluation, 2011, ACL.

[26]  Ming-Wei Chang, et al.  Driving Semantic Parsing from the World’s Response, 2010, CoNLL.

[27]  Jason Weston, et al.  Label Ranking under Ambiguous Supervision for Learning Semantic Correspondences, 2010, ICML.