A WordNet-based semantic approach to textual entailment and cross-lingual textual entailment

In this paper we explain how to build a recognizing textual entailment (RTE) system that uses only semantic similarity measures based on WordNet. We show how widely used WordNet-based word-level measures can be generalized into sentence-level semantic metrics for use in both monolingual and cross-lingual textual entailment. We experiment with a wide variety of RTE datasets and evaluate the contribution of an algorithm that expands the monolingual RTE corpus. The results obtained with this method show statistically significant differences when predicting RTE test sets. We provide an efficiency analysis of these metrics and draw some conclusions about their practical utility in recognizing textual entailment. We also analyze the cross-lingual textual entailment task, create a bilingual English–Spanish corpus, and propose a procedure for creating a cross-lingual textual entailment corpus for any pair of languages. Finally, we show that the proposed method is sufficient to build an RTE system with average performance in both monolingual and cross-lingual textual entailment, using semantic information from WordNet as its only source of lexical-semantic knowledge.
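To make the idea of lifting a word-level measure to the sentence level concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify the aggregation scheme, so the best-match averaging used here is an assumed choice, and `TOY_SIM`, `word_sim`, and `sentence_sim` are hypothetical names. The small lookup table stands in for a real WordNet-based word similarity measure.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical toy scores standing in for a WordNet-based word similarity
# measure; a real system would query WordNet instead of this table.
TOY_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"car", "automobile"}): 1.0,
    frozenset({"buy", "purchase"}): 0.9,
    frozenset({"man", "person"}): 0.8,
}


def word_sim(a: str, b: str) -> float:
    """Return a similarity in [0, 1] for two words (toy stand-in)."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset({a, b}), 0.0)


def sentence_sim(
    text: List[str],
    hypothesis: List[str],
    sim: Callable[[str, str], float] = word_sim,
) -> float:
    """Average, over hypothesis words, of the best match found in the text.

    This best-match averaging is an assumed aggregation scheme chosen for
    illustration. It is asymmetric on purpose: entailment asks whether the
    hypothesis is covered by the text, not the other way around.
    """
    if not text or not hypothesis:
        return 0.0
    best = [max(sim(h, t) for t in text) for h in hypothesis]
    return sum(best) / len(best)


if __name__ == "__main__":
    t = ["man", "buy", "car"]
    h = ["person", "purchase", "automobile"]
    print(sentence_sim(t, h))
```

A score like this could then serve as one feature for an entailment classifier; swapping `word_sim` for a different word-level measure changes the resulting sentence-level metric without touching the aggregation step.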
