Recognizing Textual Entailment: Is Lexical Similarity Enough?

We describe the system we entered in the PASCAL-2005 Recognizing Textual Entailment Challenge. Our method for recognizing entailment is based on calculating “directed” sentence similarity: we measure the directed “semantic” word overlap between the text and the hypothesis, that is, how much of the hypothesis is covered by the text. We use frequency-based term weighting in combination with two different lexical similarity measures. Although one version of the system shows a significant improvement over random guessing (with an accuracy of 57.3%), we show that this improvement is due only to a subset of the data that can be handled equally well by simple word overlap. Furthermore, we give an in-depth analysis of the system and of the challenge data.
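To make the idea of directed, frequency-weighted word overlap concrete, the following is a minimal sketch and not the system's actual implementation. It assumes whitespace-tokenized input, a smoothed inverse-document-frequency weight as the frequency-based term weighting, exact string match as a stand-in for the lexical similarity measure, and an arbitrary decision threshold of 0.5. All function names and parameters (`idf_weights`, `directed_similarity`, `entails`, `word_sim`, `threshold`) are hypothetical.

```python
import math
from collections import Counter


def idf_weights(corpus):
    """Assumed weighting: smoothed IDF, so rarer words weigh more."""
    n = len(corpus)
    # Document frequency: number of token lists each word occurs in.
    df = Counter(w for doc in corpus for w in set(doc))
    return {w: math.log((n + 1) / (df[w] + 1)) + 1.0 for w in df}


def exact_match(a, b):
    """Stand-in word similarity; a lexical resource could replace it."""
    return 1.0 if a == b else 0.0


def directed_similarity(text, hypothesis, weights,
                        word_sim=exact_match, default_weight=1.0):
    """Weighted fraction of hypothesis words that are covered by the text.

    The measure is directed: it asks how well the text covers the
    hypothesis, not the other way around.
    """
    covered = 0.0
    total = 0.0
    for h in hypothesis:
        w = weights.get(h, default_weight)
        # Best match for this hypothesis word among the text words.
        best = max((word_sim(h, t) for t in text), default=0.0)
        covered += w * best
        total += w
    return covered / total if total else 0.0


def entails(text, hypothesis, weights, threshold=0.5):
    """Predict entailment when the directed similarity reaches the
    (hypothetical) threshold."""
    return directed_similarity(text, hypothesis, weights) >= threshold


text = "the cat sat on the mat".split()
hypothesis = "the cat sat".split()
weights = idf_weights([text, hypothesis])

forward = directed_similarity(text, hypothesis, weights)
backward = directed_similarity(hypothesis, text, weights)
```

In this toy pair every hypothesis word occurs in the text, so the forward score is maximal, while swapping the arguments gives a lower score because "on" and "mat" are not covered. That asymmetry is what "directed" refers to; swapping `exact_match` for a graded word similarity would give the "semantic" variant of the overlap.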