Extracting Context-Rich Entailment Rules from Wikipedia Revision History

Recent work on Textual Entailment has shown that knowledge plays a crucial role in supporting entailment inferences. However, it has also been demonstrated that currently available entailment rules are still far from optimal. We propose a methodology for the automatic acquisition of large-scale, context-rich entailment rules from Wikipedia revisions, taking advantage of the syntactic structure of entailment pairs to define the most appropriate linguistic constraints for a rule to be successfully applicable. We report on rule acquisition experiments on Wikipedia, showing that the methodology enables the creation of a rule repository that is both innovative (i.e., the acquired rules are not present in other available resources) and of good quality.
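
To make the idea of mining revisions concrete, the following is a minimal sketch of how a candidate rule might be read off a pair of sentence revisions. It is not the authors' pipeline: the abstract states that the methodology relies on the syntactic structure of the pair, whereas this sketch only aligns whitespace tokens with Python's standard difflib module. The function name candidate_rule, the context parameter, and the example sentences are all hypothetical and introduced purely for illustration.

```python
from difflib import SequenceMatcher


def candidate_rule(old_sentence, new_sentence, context=1):
    """Return a (lhs, rhs) candidate rule from two sentence revisions.

    Illustrative assumption: a pair is kept only when the revisions
    differ by exactly one replaced token span. The span is widened by
    `context` tokens on each side as a crude, token-level stand-in for
    the syntactic context used in the paper.
    """
    old = old_sentence.split()
    new = new_sentence.split()

    # Align the two token sequences and keep only the non-matching regions.
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    ops = [op for op in matcher.get_opcodes() if op[0] != "equal"]

    # Discard pairs with no change, several changes, or pure insertions/deletions.
    if len(ops) != 1 or ops[0][0] != "replace":
        return None

    _, i1, i2, j1, j2 = ops[0]
    lhs = old[max(0, i1 - context): i2 + context]
    rhs = new[max(0, j1 - context): j2 + context]
    return " ".join(lhs), " ".join(rhs)


if __name__ == "__main__":
    print(candidate_rule(
        "The mayor was assassinated in 1990",
        "The mayor was killed in 1990",
    ))
```

On the example pair this yields the candidate "was assassinated in" → "was killed in". Whether such a candidate is a valid entailment rule, and in which direction, would still have to be decided by a separate filtering step that this sketch does not attempt.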
