Deterministic Statistical Mapping of Sentences to Underspecified Semantics

We present a method for training a statistical model that maps natural language sentences to semantic expressions. The semantics are expressions of an underspecified logical form whose properties make it particularly suitable for statistical mapping from text. An encoding of the semantic expressions into dependency trees with automatically generated labels allows existing methods for statistical dependency parsing to be applied to the mapping task, without the need for separate traditional dependency labels or parts of speech. The encoding also yields a natural per-word semantic mapping accuracy measure. We report the results of training and testing statistical models for mapping sentences of the Penn Treebank into the semantic expressions; per-word semantic mapping accuracy ranges between 79% and 86% depending on the experimental conditions. The particular choice of algorithms also means that our trained mapping is deterministic (in the sense of deterministic parsing), paving the way for large-scale text-to-semantic mapping.
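
As a rough illustration of what a per-word accuracy measure over a dependency-tree encoding could look like, the sketch below assumes each word is annotated with a (head index, label) pair and counts a word as correct only when both match the gold annotation. The abstract does not give the exact definition, so the matching criterion, the function name `per_word_accuracy`, and the placeholder labels are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical per-word annotation: (index of the head word, generated label).
# The labels used below are placeholders, not labels from the paper.
WordAnnotation = Tuple[int, str]


def per_word_accuracy(gold: List[WordAnnotation],
                      predicted: List[WordAnnotation]) -> float:
    """Fraction of words whose predicted (head, label) pair equals the gold pair."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same words")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Toy four-word sentence: the third word receives the wrong label.
gold = [(2, "A"), (0, "B"), (2, "C"), (3, "D")]
pred = [(2, "A"), (0, "B"), (2, "X"), (3, "D")]
accuracy = per_word_accuracy(gold, pred)
```

Under this assumed criterion, a word with the right head but the wrong label counts as an error, which is stricter than scoring attachment alone.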
