Analysis of Statistical Question Classification for Fact-Based Questions

Question classification plays an important role in question answering systems and can be used in a wide range of other domains. The goal of question classification is to accurately assign labels to questions based on the expected answer type. Most past approaches have relied on matching questions against hand-crafted rules. However, rules are laborious to create and are often too specific to generalize. Statistical question classification methods overcome these issues by employing machine learning techniques. We show empirically that a statistical approach is robust and achieves good performance on three diverse data sets with little or no hand tuning. Furthermore, we examine the effect that different syntactic and semantic features have on performance. We find that semantic features tend to increase performance more than purely syntactic features. Finally, we analyze common causes of misclassification and provide insight into how they may be overcome.
