In Question Answering, Two Heads Are Better Than One

Motivated by the success of ensemble methods in machine learning and in other areas of natural language processing, we developed a multi-strategy, multi-source approach to question answering that combines the results of different answering agents searching for answers in multiple corpora. The answering agents adopt fundamentally different strategies: one relies primarily on knowledge-based mechanisms, the other on statistical techniques. We present a multi-level answer resolution algorithm that combines results from the answering agents at the question, passage, and/or answer levels. Experiments evaluating this algorithm show a 35.0% relative improvement over our baseline system in the number of questions correctly answered, and a 32.8% improvement according to the average precision metric.
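
To make the idea of answer-level combination concrete, the following is a minimal sketch and not the resolution algorithm described in the paper. It assumes each agent returns a list of (answer, confidence) pairs, that answers match when they are equal after lowercasing and whitespace normalization, and that evidence is merged by a weighted sum of confidences. The names `normalize`, `combine_answers`, and `agent_weights`, as well as the example candidates and scores, are hypothetical.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Crude answer matching (an assumption): lowercase, collapse whitespace."""
    return " ".join(answer.lower().split())


def combine_answers(agent_results, agent_weights=None):
    """Merge candidate answers from several agents at the answer level.

    agent_results: dict mapping agent name -> list of (answer, confidence).
    agent_weights: optional dict mapping agent name -> weight (default 1.0).
    Returns a list of (answer, combined_score), best first.
    """
    if agent_weights is None:
        agent_weights = {name: 1.0 for name in agent_results}

    scores = defaultdict(float)
    surface = {}  # first surface form seen for each normalized answer
    for name, candidates in agent_results.items():
        for answer, confidence in candidates:
            key = normalize(answer)
            # Hypothetical scoring rule: weighted sum of agent confidences.
            scores[key] += agent_weights[name] * confidence
            surface.setdefault(key, answer)

    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(surface[key], score) for key, score in ranked]


# Illustrative candidates only; these are not results from the paper.
results = {
    "knowledge_based": [("Mount Everest", 0.6), ("K2", 0.3)],
    "statistical": [("mount everest", 0.5), ("Kangchenjunga", 0.4)],
}
combined = combine_answers(results)
```

In this sketch an answer proposed by both agents accumulates evidence from each, so agreement between the two strategies pushes it up the ranking. Combination at the question or passage level would need different inputs and is not shown here.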
