The Task First, Please

We examine the current state of evaluation exercises for automatic Question Answering (QA) systems, specifically targeting the QA task (QA@CLEF) as it is evaluated within the setting of the Cross-Language Evaluation Forum (CLEF). We describe several key issues for the evaluation of QA systems and show how they are problematic in the current setup of the tasks at QA@CLEF. We argue that many of the problems are caused by the lack of a clear understanding of the QA task, which should cover potential users, types of information needs, and types of available information resources. Finally, we propose several scenarios for QA and focused retrieval tasks that address these problematic issues. Our main conclusion is simple but important: a clear task definition is paramount for a meaningful evaluation of automatic systems, as evidenced by the overview of the QA evaluation setups.
