Paper information - Monte Carlo semantics: robust inference and logical pattern processing with natural language text

This thesis develops several pieces of theory and computational techniques which can be deployed for the purpose of allowing a computer to analyze short pieces of text (e.g. ‘Socrates is a man and every man is mortal.’) and, on the basis of such an analysis, to decide yes/no questions about the text (‘Is Socrates mortal?’). More particularly, the problem is seen as a logical inferencing task. The computer must decide whether or not a logical consequence relation ‘therefore’ holds between the two pieces of text. (‘Socrates is a man and every man is mortal, therefore Socrates is mortal.’) This problem is a pervasive theme in logic and semantics but has also been subject over the last five years to a wave of renewed attention in computational linguistics sparked by the Recognizing Textual Entailment (RTE) challenge. A critical reevaluation of this line of work is presented here which demonstrates several problems concerning the empirical methodology used at RTE and the results derived from it. This thesis is thus more theory-driven, but nevertheless inspired by RTE in that it addresses problems raised by RTE which have not previously received sufficient attention from a theoretical viewpoint, such as the problem of robustness. With this goal in mind, two of the results on Natural Language Reasoning (NLR) established here become particularly important: (1) Assuming the syllogism as a benchmark fragment of NLR, the model theory which underlies NLR is not necessarily a two-valued logic, but it can be the many-valued Łukasiewicz logic. (2) Despite the fact that the syllogism is a logical language of less expressive power than natural language as a whole, a good approximation to NLR can still be obtained by using the method outlined here for rewriting natural language text into syllogistic premises.
These two properties of NLR enable the approach to robust inference and logical pattern processing called Monte Carlo semantics, which, in turn, demonstrates that a single logically based theory can account for the semantic informativity of deep techniques using theorem proving and for the robustness of bag-of-words shallow inference.
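As a rough illustration of how many-valued truth degrees and sampling can fit together, the following sketch evaluates a propositional toy version of the Socrates example under the standard Łukasiewicz connectives and averages the truth degree of "premise implies conclusion" over randomly drawn valuations. It is only a sketch under stated assumptions: the propositional encoding, the uniform sampling of valuations, and the names `l_and`, `l_implies`, and `estimate_entailment` are hypothetical and are not taken from the thesis, which works with syllogistic premises rather than this simplified encoding.

```python
import random

# Standard Lukasiewicz connectives over truth degrees in [0, 1].
def l_not(x):
    return 1.0 - x

def l_and(x, y):
    # Strong conjunction (Lukasiewicz t-norm).
    return max(0.0, x + y - 1.0)

def l_implies(x, y):
    return min(1.0, 1.0 - x + y)

def estimate_entailment(premise, conclusion, atoms, samples=10000, seed=0):
    """Average truth degree of premise -> conclusion over random valuations.

    Assumption for illustration: each atom's truth degree is drawn
    independently and uniformly from [0, 1].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        v = {a: rng.random() for a in atoms}
        total += l_implies(premise(v), conclusion(v))
    return total / samples

ATOMS = ["man", "mortal"]

# 'Socrates is a man and every man is mortal, therefore Socrates is mortal.'
valid = estimate_entailment(
    lambda v: l_and(v["man"], l_implies(v["man"], v["mortal"])),
    lambda v: v["mortal"],
    ATOMS,
)

# Affirming the consequent: 'Socrates is mortal and every man is mortal,
# therefore Socrates is a man.'
invalid = estimate_entailment(
    lambda v: l_and(v["mortal"], l_implies(v["man"], v["mortal"])),
    lambda v: v["man"],
    ATOMS,
)

print(f"valid pattern:   {valid:.3f}")
print(f"invalid pattern: {invalid:.3f}")
```

In this toy setting the valid pattern scores essentially 1 under every valuation, while the invalid one averages noticeably lower, so the estimate degrades gradually rather than flipping between yes and no.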

Richard Bergmair