On Evaluating the Contribution of Validation for Question Answering

Validation is emerging as a crucial component of new architectures aimed at improving Question Answering (QA) technologies. Hence, there is a strong need for appropriate measures to evaluate Validation and, even more, its impact on QA results. However, common Validation measures do not allow a clear study of this impact, and they might even lead researchers to draw wrong conclusions. We propose a new approach for evaluating Validation technologies that offers clear and useful information about their impact on QA performance. We compare our proposal with classic evaluation measures and show the benefits of our scheme.
