UAlacant: Using Online Machine Translation for Cross-Lingual Textual Entailment

This paper describes a new method for detecting cross-lingual textual entailment (CLTE) based on machine translation (MT). We use sub-segment translations obtained from several MT systems available online as a source of cross-lingual knowledge. We describe and evaluate different features derived from these sub-segment translations, which a support vector machine classifier then uses to detect CLTE. We submitted this system to SemEval 2012 Task 8, where it obtained an accuracy of up to 59.8% on the English-Spanish test set, making it the second-best-performing approach in the contest.
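As a rough illustration of how sub-segment translations might be turned into a classifier feature, the sketch below computes the fraction of source sub-segments (word n-grams) that have at least one translation appearing in the target sentence. This is only an assumption about the general shape of such a feature, not the feature set described in the paper. The names `sub_segments`, `coverage_feature`, and `TOY_MT` are hypothetical, and the small dictionary stands in for the output of the online MT systems.

```python
from typing import Dict, List, Set


def sub_segments(tokens: List[str], max_len: int) -> List[str]:
    """Return all contiguous word n-grams of length 1..max_len."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    ]


def coverage_feature(source: str, target: str,
                     translations: Dict[str, Set[str]],
                     max_len: int = 2) -> float:
    """Fraction of source sub-segments with a translation found in target."""
    src_segs = sub_segments(source.lower().split(), max_len)
    tgt_segs = set(sub_segments(target.lower().split(), max_len))
    if not src_segs:
        return 0.0
    # A sub-segment counts as matched if any of its translations
    # occurs as a sub-segment of the target sentence.
    matched = sum(
        1 for seg in src_segs if translations.get(seg, set()) & tgt_segs
    )
    return matched / len(src_segs)


# Hypothetical toy lookup standing in for online MT output (assumption).
TOY_MT: Dict[str, Set[str]] = {
    "the": {"el", "la"},
    "cat": {"gato"},
    "sleeps": {"duerme"},
    "the cat": {"el gato"},
}

feature = coverage_feature("The cat sleeps", "El gato duerme", TOY_MT)
```

In a full system, values like `feature`, computed in both translation directions and for several sub-segment lengths, could form the feature vector passed to the support vector machine classifier.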
