External Knowledge Sources for Question Answering

MIT CSAIL’s entries for the TREC Question Answering track (Voorhees, 2005) focused on incorporating external general-knowledge sources into the question answering process. We also explored the effect of document retrieval on factoid question answering, in keeping with the community-wide focus on document retrieval. For the new relationship task, we present a passage-retrieval-based algorithm emphasizing synonymy, which performed best among automatic systems this year. Our most prominent new external knowledge source is Wikipedia, whose most useful component is the synonymy implicit in its subtitles and redirect link structure; Wikipedia is also a large new source of hypernym information.

The main task included factoid questions, for which we modified the freely available Web-based Aranea question answering engine; list questions, for which we used hypernym hierarchies to constrain candidate answers; and definitional ‘other’ questions, for which we combined candidate snippets generated by several previous definition systems using a new novelty-based reranking method inspired by Allan et al. (2003). Our factoid engine, Aranea (Lin and Katz, 2003), uses the World Wide Web to find candidate answers to the given question and then projects its best candidates onto the newspaper corpus, choosing the one best supported by that corpus.
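As a concrete illustration of the synonymy idea (not the authors' implementation), the sketch below shows one way redirect links can be turned into a query-expansion table for passage retrieval. The redirect pairs, function names, and output are illustrative assumptions; in practice the pairs would be harvested from a Wikipedia dump rather than hard-coded.

# Minimal sketch, assuming redirect pairs have already been extracted from a
# Wikipedia dump; here a few pairs are hard-coded for illustration.
from collections import defaultdict

# (redirect title, canonical article title) pairs: a redirect such as
# "USSR" -> "Soviet Union" marks the two strings as interchangeable.
REDIRECTS = [
    ("USSR", "Soviet Union"),
    ("U.S.S.R.", "Soviet Union"),
    ("Big Blue", "IBM"),
    ("International Business Machines", "IBM"),
]

def build_synonym_index(redirects):
    """Map each surface form to the full set of forms sharing its target."""
    by_target = defaultdict(set)
    for alias, target in redirects:
        by_target[target].update({alias.lower(), target.lower()})
    index = {}
    for forms in by_target.values():
        for form in forms:
            index[form] = forms
    return index

def expand_query(terms, index):
    """Replace each query term with the union of its known synonyms."""
    expanded = []
    for term in terms:
        expanded.extend(sorted(index.get(term.lower(), {term.lower()})))
    return expanded

if __name__ == "__main__":
    index = build_synonym_index(REDIRECTS)
    print(expand_query(["USSR", "grain", "exports"], index))
    # ['soviet union', 'u.s.s.r.', 'ussr', 'grain', 'exports']

Storing one shared set of surface forms per redirect target keeps every alias interchangeable at retrieval time, which mirrors the way the abstract describes exploiting Wikipedia's redirect structure for the relationship task.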

References

[1] Inderjeet Mani, et al. How to Evaluate Your Question Answering System Every Day ... and Still Get Real Work Done, 2000, LREC.

[2] Jimmy J. Lin, et al. Omnibase: Uniform Access to Heterogeneous Data for Question Answering, 2002, NLDB.

[3] Alexey Radul, et al. Nuggeteer: Automatic Nugget-Based Evaluation using Descriptions and Judgements, 2006, NAACL.

[4] James Allan, et al. Retrieval and novelty detection at the sentence level, 2003, SIGIR.

[5] Ralph Grishman, et al. NOMLEX: a lexicon of nominalizations, 1998.

[6] Jimmy J. Lin, et al. Automatically Evaluating Answers to Definition Questions, 2005, HLT.

[7] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[8] Jimmy J. Lin, et al. What Works Better for Question Answering: Stemming or Morphological Query Expansion?, 2004.

[9] Jimmy J. Lin, et al. Quantitative evaluation of passage retrieval algorithms for question answering, 2003, SIGIR.

[10] Ellen M. Voorhees, et al. Overview of the TREC-9 Question Answering Track, 2000, TREC.

[11] Richard M. Schwartz, et al. An Algorithm that Learns What's in a Name, 1999, Machine Learning.

[12] Jimmy J. Lin, et al. Question answering from the web using knowledge annotation and knowledge mining techniques, 2003, CIKM '03.

[13] Boris Katz, et al. Using English for Indexing and Retrieving, 1991.

[14] Aaron D. Fernandes. Answering definitional questions before they are asked, 2004.

[15] Boris Katz, et al. Annotating the World Wide Web using Natural Language, 1997, RIAO.