Knowledge and Reasoning for Answering Questions

Research has shown that, for many questions posed by physicians, no answer exists in biomedical corpora. We have therefore developed a question filtering component that determines whether a posed question is answerable. Using 200 clinical questions that physicians annotated as answerable or unanswerable, we explored supervised machine-learning algorithms that automatically classify questions into one of these two categories. We also incorporated semantic features from a large biomedical knowledge terminology. Our results show that incorporating semantic features generally improves question classification performance, and that the best system is a probabilistic indexing system that achieves 80.5% accuracy. Our analysis also shows that stop words may play an important role in separating answerable from unanswerable questions.
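As a rough illustration of the two-class setup described above, the following minimal sketch trains a bag-of-words classifier to label questions as answerable or unanswerable. It is not the system evaluated here: it substitutes a plain multinomial naive Bayes model for the probabilistic indexing system, uses no semantic features, and the four training questions are invented purely for illustration (they are not drawn from the 200 annotated questions). The class name `NaiveBayesQuestionFilter` and the helper `tokenize` are hypothetical. Stop words are deliberately kept as features, in line with the observation that they may help separate the two classes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; stop words are deliberately kept.
    return text.lower().replace("?", " ").split()


class NaiveBayesQuestionFilter:
    """Multinomial naive Bayes with add-alpha smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # questions seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def fit(self, questions, labels):
        for question, label in zip(questions, labels):
            self.class_counts[label] += 1
            for tok in tokenize(question):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, question):
        total_questions = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(n / total_questions)  # class prior
            for tok in tokenize(question):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, question):
        scores = self.log_scores(question)
        return max(scores, key=scores.get)


# Invented toy examples, NOT taken from the annotated question set.
TRAIN = [
    ("What is the recommended dose of amoxicillin for otitis media?", "answerable"),
    ("What are the side effects of metformin?", "answerable"),
    ("Should I be worried about this patient?", "unanswerable"),
    ("Why did she stop coming to the clinic?", "unanswerable"),
]

model = NaiveBayesQuestionFilter().fit(
    [q for q, _ in TRAIN], [y for _, y in TRAIN]
)
print(model.predict("What is the dose of metformin?"))
print(model.predict("Why did this patient stop?"))
```

With so few examples the predictions only reflect word overlap with the toy training set; a realistic filter would need the annotated questions, held-out evaluation, and the semantic features mentioned above.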
