This paper presents methods for answering what we call Cross-passage Evidence Questions. These questions require multiple scattered passages, each bearing different and partial evidence for the answer. They pose special challenges to textual QA systems that employ information retrieval in the “conventional” way, because the ensuing Answer Extraction operation assumes that one of the retrieved passages will, by itself, contain sufficient evidence to recognize and extract the answer. One method that may overcome this problem is factoring a Cross-passage Evidence Question into constituent sub-questions and joining the respective answers. The first goal of this paper is to develop this method and put it to the test to see how effective it can be. We then introduce another method, Direct Answer Retrieval, which relies on extensive pre-processing to collect the different pieces of evidence for a possible answer off-line. We conclude that the latter method is superior both in the correctness of its answers and in its overall efficiency in dealing with Cross-passage Evidence Questions.

1 Distinguishing Questions Based on Evidence Locality

Textual factoid Question Answering depends on the existence of at least one passage or text span in the corpus that can serve as sufficient evidence for the question. A single piece of evidence may suffice to answer a question, or more than one piece may be needed. By “a piece of evidence” we mean a snippet of continuous text, or passage, that supports or justifies an answer to the question posed. More practically, in factoid QA, a piece of evidence is a text span with two properties: (1) an Information Retrieval (IR) procedure can recognise it as relevant to the question, and (2) an automated Answer Extraction (AE) procedure can extract from it an answer-bearing expression (a.k.a. an answer candidate).

With respect to a given corpus, we call questions with the following property Single Passage Evidence Questions, or SEQs: a question Q is a SEQ if evidence E sufficient to select A as an answer to Q can be found in the same text snippet as A. In contrast, we call a question that requires multiple different pieces of evidence (in multiple text spans with respect to a corpus) a Cross-passage Evidence Question, or CEQ: a question Q is a CEQ if the set of evidence E1, ..., En needed to justify A as an answer to Q cannot be found in a single text snippet containing A, but only in a set of such snippets.

For example, consider the following question: Which Sub-Saharan country has hosted the World Cup? If the evidence for a country being located south of the Sahara Desert and the evidence for that same country having hosted the World Cup are not contained in the same passage or sentence, but are found only in separate passages, then the question is a CEQ with respect to that corpus.
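To make the decomposition approach concrete, the sketch below (in Python, and not the paper's implementation) factors the running example into two sub-questions, answers each with a stand-in single-passage QA pipeline, and joins the answer sets by intersection. The function names, the `answer_seq` pipeline, and the toy sub-questions are illustrative assumptions, not part of the original system.

```python
# Minimal sketch of factoring a CEQ into sub-questions and joining the answers.
# answer_seq() stands in for a conventional single-passage QA pipeline
# (IR over the corpus followed by Answer Extraction); it is assumed here.

from typing import Callable


def answer_ceq_by_decomposition(
    sub_questions: list[str],
    answer_seq: Callable[[str], set[str]],
) -> set[str]:
    """Answer each constituent sub-question independently (each one is an
    SEQ), then join the answer sets: only candidates supported by every
    piece of evidence survive the intersection."""
    answer_sets = [answer_seq(q) for q in sub_questions]
    return set.intersection(*answer_sets) if answer_sets else set()


if __name__ == "__main__":
    # Toy stand-in for the SEQ pipeline, keyed on the example sub-questions.
    def toy_answer_seq(question: str) -> set[str]:
        toy_answers = {
            "Which countries are located south of the Sahara Desert?":
                {"South Africa", "Nigeria", "Kenya"},
            "Which countries have hosted the World Cup?":
                {"South Africa", "Brazil", "Germany"},
        }
        return toy_answers.get(question, set())

    subs = [
        "Which countries are located south of the Sahara Desert?",
        "Which countries have hosted the World Cup?",
    ]
    print(answer_ceq_by_decomposition(subs, toy_answer_seq))
    # {'South Africa'}
```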
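Direct Answer Retrieval can be sketched in the same spirit: an offline pass attaches evidence passages to candidate answers, and at question time candidates are checked directly against the question's constraints, with no per-question passage retrieval. The sketch below only indicates the general shape of such pre-processing; `build_answer_index`, `direct_answer_retrieval`, `extract_entities`, and the constraint predicates are hypothetical names introduced for illustration.

```python
# Sketch of the offline pre-processing behind Direct Answer Retrieval:
# the corpus is scanned once, and every passage is attached as evidence to
# the candidate answers (here: entities) it mentions.

from collections import defaultdict


def build_answer_index(corpus, extract_entities):
    """Map each candidate answer to the list of passages that mention it."""
    index = defaultdict(list)
    for passage in corpus:
        for entity in extract_entities(passage):
            index[entity].append(passage)
    return index


def direct_answer_retrieval(index, constraints):
    """Return candidates whose pooled evidence satisfies every constraint,
    even when no single passage covers more than one of them."""
    answers = []
    for candidate, passages in index.items():
        if all(any(check(p) for p in passages) for check in constraints):
            answers.append(candidate)
    return answers
```

For the running example, one constraint predicate would test whether a passage locates the candidate south of the Sahara and another whether it reports the candidate hosting the World Cup; a candidate such as South Africa can then satisfy the question through two different passages, which is exactly the case that defeats single-passage Answer Extraction.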