A statistical classification approach to question answering using Web data

In this paper we treat question answering (QA) as a classification problem. Our motivation is to build systems for many languages without the need for highly tuned linguistic modules. Consequently, word tokens and Web data are used extensively, but no explicit linguistic knowledge is incorporated. A mathematical model for answer retrieval, answer classification and answer length prediction is derived. The TREC 2002 QA task is used for system development, where 33% of questions are answered correctly. Performance is then evaluated on the factoid questions of the TREC 2003 QA task, where 23% of questions are answered correctly, which would rank the system in the top 10 of contemporary QA systems on the same task.
