Evaluation of an extraction-based approach to answering definitional questions

This paper evaluates an extraction-based approach to answering definitional questions. Our system extracted useful linguistic constructs, called linguistic features, from raw text using information extraction tools and formulated answers from these features. The features employed include appositives, copulas, structured patterns, relations, propositions, and raw sentences. Features were ranked by feature type and by similarity to a question profile, and redundant features were detected using a simple heuristic-based strategy. The approach achieved state-of-the-art performance at the TREC 2003 QA evaluation. Component analysis of the system was carried out using an automatic scoring function called Rouge (Lin and Hovy, 2003). Major findings include: 1) answers built from linguistic features are significantly better than those using raw sentences; 2) the most useful features are appositives and copulas; 3) question profiles, as a means of modeling user interests, can significantly improve system performance; 4) Rouge scores are closely correlated with subjective evaluation results, indicating that Rouge is suitable for evaluating definitional QA systems.
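As a rough illustration of the pipeline described above — rank extracted features by type and by similarity to a question profile, then drop redundant features with a simple overlap heuristic — here is a minimal Python sketch. The type priorities, similarity measure, and overlap threshold are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the answer-selection pipeline from the abstract.
# Feature types are ordered most-useful-first, reflecting the finding that
# appositives and copulas were the strongest features; all weights and
# thresholds below are assumptions for illustration only.
TYPE_RANK = {"appositive": 0, "copula": 1, "pattern": 2,
             "relation": 3, "proposition": 4, "sentence": 5}

def profile_similarity(text, profile_terms):
    """Fraction of question-profile terms that appear in the feature text."""
    if not profile_terms:
        return 0.0
    words = set(text.lower().split())
    return len(words & profile_terms) / len(profile_terms)

def is_redundant(text, kept, threshold=0.7):
    """Simple heuristic: redundant if it mostly repeats an already-kept answer."""
    words = set(text.lower().split())
    for prev in kept:
        prev_words = set(prev.lower().split())
        if len(words & prev_words) / max(len(words), 1) >= threshold:
            return True
    return False

def select_answers(features, profile_terms, max_answers=10):
    """features: list of (feature_type, text) pairs; returns ranked answers."""
    ranked = sorted(
        features,
        key=lambda f: (TYPE_RANK.get(f[0], 99),
                       -profile_similarity(f[1], profile_terms)))
    kept = []
    for _, text in ranked:
        if not is_redundant(text, kept):
            kept.append(text)
        if len(kept) == max_answers:
            break
    return kept
```

With this sketch, an appositive that matches the question profile outranks a raw sentence, and a near-duplicate appositive is filtered by the overlap heuristic.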

[1] Kathleen McKeown, et al. A Hybrid Approach for QA Track Definitional Questions, 2003, TREC.

[2] Sanda M. Harabagiu, et al. Answer Mining by Combining Extraction Techniques with Abductive Reasoning, 2003, TREC.

[3] Inderjeet Mani, et al. Producing Biographical Summaries: Combining Linguistic Knowledge with Corpus Statistics, 2001, ACL.

[4] Eduard H. Hovy, et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics, 2003, NAACL.

[5] David A. Hull. Using statistical testing in the evaluation of retrieval experiments, 1993, SIGIR.

[6] Sergey Bratus, et al. Experiments in Multi-Modal Automatic Content Extraction, 2001, HLT.

[7] Daniel Marcu, et al. Multiple-Engine Question Answering in TextMap, 2003, TREC.

[8] Jennifer Chu-Carroll, et al. IBM's PIQUANT in TREC2003, 2003, TREC.

[9] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[10] Jinxi Xu, et al. A Hybrid Approach to Answering Biographical Questions, 2004, New Directions in Question Answering.

[11] Dragomir R. Radev, et al. Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies, 2000, ArXiv.

[12] Richard M. Schwartz, et al. BBN: Description of the SIFT System as Used for MUC-7, 1998, MUC.

[13] Ellen M. Voorhees, et al. Overview of the TREC 2002 Question Answering Track, 2003, TREC.

[14] Richard M. Schwartz, et al. A hidden Markov model information retrieval system, 1999, SIGIR '99.