Evaluating Causal Questions for Question Answering

Question answering systems extract answers to a question rather than retrieving relevant documents. Evaluating question answering requires a human assessor to decide the correctness of the answers, since the same answer can be expressed in different ways. A suitable test collection can therefore help identify where systems perform well and where they fail. Our analysis of the data collection suggests that, for a proportion of the texts, the reasoning or explanation that constitutes an answer to a “why” question is present in, or can be extracted from, the source text.

We report on an implemented component for extracting candidate answers from source text. This component combines lexical overlap and lexical semantic relatedness (a lexico-syntactic approach) to rank possible answers to causal questions. On undifferentiated texts, we obtain an overall recall of 34.13%, indicating that simple matching is adequate for answering over a third of “why” questions. We have analyzed the question-answer pairs in which the answer is explicit, ambiguous, or implicit, and shown that if we can separate out the last category, recall increases considerably.
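To make the ranking idea concrete, the following is a minimal sketch that scores each candidate passage by combining token overlap with the question and WordNet-based lexical semantic relatedness. The tokenization, the choice of path similarity as the relatedness measure, and the equal weights are illustrative assumptions, not the exact configuration of the implemented component.

```python
# Sketch: rank candidate answer passages for a causal ("why") question by
# combining lexical overlap and lexical semantic relatedness.
# Assumes NLTK with the WordNet corpus installed; weights are hypothetical.
from nltk.corpus import wordnet as wn


def tokens(text):
    """Lowercased word tokens (naive split; a real system would use a proper tokenizer)."""
    raw = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in raw if w]


def lexical_overlap(question, candidate):
    """Fraction of question tokens that also appear in the candidate passage."""
    q, c = set(tokens(question)), set(tokens(candidate))
    return len(q & c) / len(q) if q else 0.0


def semantic_relatedness(question, candidate):
    """Average best WordNet path similarity between question tokens and candidate tokens."""
    scores = []
    cand_words = set(tokens(candidate))
    for qw in set(tokens(question)):
        best = 0.0
        for qs in wn.synsets(qw):
            for cw in cand_words:
                for cs in wn.synsets(cw):
                    sim = qs.path_similarity(cs)
                    if sim is not None and sim > best:
                        best = sim
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0


def rank_candidates(question, candidates, w_overlap=0.5, w_semantic=0.5):
    """Return candidates sorted by the combined score (weights are assumed, not tuned)."""
    scored = [
        (w_overlap * lexical_overlap(question, c)
         + w_semantic * semantic_relatedness(question, c), c)
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    question = "Why did the company recall the product?"
    candidates = [
        "The product was recalled because a safety defect was found in the battery.",
        "The company announced record profits last quarter.",
    ]
    for score, passage in rank_candidates(question, candidates):
        print(f"{score:.3f}  {passage}")
```

In this sketch, passages that explicitly restate question terms are favored by the overlap term, while the relatedness term gives partial credit to passages that paraphrase them, which mirrors the intuition that simple matching suffices for explicit answers but not for implicit ones.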