Open-Domain Question-Answering

The top-performing Question-Answering (QA) systems have been of two types: consistent, well-established, multi-faceted systems that do well year after year, and newcomers that employ innovative approaches and outperform almost everybody else. This article examines both types of system in depth. We establish what a "typical" QA system looks like and cover the approaches commonly used by its component modules. Understanding these will enable any proficient system developer to build their own QA system. Fortunately, many components are freely available from their developers, which makes this a reasonable expectation for a graduate-level project. We also look at particular systems that have performed well and that employ interesting and innovative approaches.
