Supervised Recognition of Entailment Between Patterns

In this paper we present a supervised method for recognizing entailment between binary lexico-syntactic patterns such as “X is the capital of Y” and “X is in Y”. Recognizing entailment relations between patterns is useful for applications such as question answering, which is our main motivation in this work. Since sentences that entail each other are natural paraphrases, entailment is closely related to paraphrasing. Many researchers have successfully used unsupervised methods based on distributional similarity for paraphrase acquisition [4, 6, 1], and our own experience with NICT’s spoken question answering system Ikkyu [7] confirms their effectiveness.
If Ikkyu could also detect that “X is the capital of Y” entails “X is in Y”, it would be able to answer the question “Where is Paris?” from the information that “Paris is the capital of France”. However, “X is the capital of Y” and “X is in Y” are not strict paraphrases, and indeed their distributional profiles exhibit large differences. Ikkyu’s current paraphrasing engine is based on distributional similarity between patterns and so is highly sensitive to such differences, which is why it currently cannot use the former statement to answer the latter question. By adding an accurate and robust entailment recognition module that can recognize entailment pairs even when their distributional profiles differ greatly, we aim to further improve Ikkyu’s recall.
In this work we explore a supervised method for entailment recognition that uses both distributional similarities and surface/syntactic features. We show that this supervised approach performs better than state-of-the-art unsupervised methods, such as DIRT [4] or the scoring method from [2], and than supervised methods that consider only surface similarity, such as [5]. This holds for all types of pattern pairs, even those with very low surface similarity (i.e., sharing no content words).
Our approach is targeted at Japanese but is easily applicable to other languages. Section 2 describes the resources and features used, and Section 3 presents our experimental methodology and discusses our results.