Retrieving Passages and Finding Answers

Retrieving topically relevant text passages from documents has been studied extensively, but finding non-factoid, multiple-sentence answers to web queries is a different task that is becoming increasingly important for applications such as mobile search. As the first stage of developing retrieval models for "answer passages", we describe the process of creating a test collection of questions and multiple-sentence answers based on the TREC GOV2 queries and documents. This annotation shows that most of the description-length TREC queries do in fact have passage-level answers. We then examine how effectively current passage retrieval models find passages that contain answers. We show that the existing methods are not effective for this task, and we also observe that the relative performance of these methods in retrieving answers does not correspond to their performance in retrieving relevant documents.
