The Role of Semantic Information in Learning Question Classifiers

Question classification is commonly used in question answering systems to assign a semantic class to the target answer, providing additional information to downstream processes. It differs from the common text categorization task in that questions are relatively short and contain less word-based information than full documents. This work presents a machine learning approach to this task. Our approach augments the questions with syntactic and semantic analysis, as well as external semantic knowledge, as input to the text classifier. We show that, in the context of question classification, augmenting the classifier's input with appropriate semantic category information yields significant improvements in classification accuracy.
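The idea of augmenting a question with semantic category information can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny `SEMANTIC_CLASSES` lexicon, the `augment_features` helper, and the `w=` / `sem=` feature naming are hypothetical and are not the feature set or resources used in this work.

```python
# Hypothetical semantic lexicon mapping words to semantic categories.
# Illustrative only; a real system would draw on external resources.
SEMANTIC_CLASSES = {
    "city": ["LOCATION"],
    "capital": ["LOCATION"],
    "president": ["HUMAN"],
    "author": ["HUMAN"],
    "year": ["NUMERIC"],
}


def augment_features(question):
    """Return word features followed by semantic-category features."""
    # Lowercase, drop a trailing question mark, and split on whitespace.
    tokens = question.lower().rstrip("?").split()
    # Plain word-based features, as in ordinary text categorization.
    features = ["w=" + token for token in tokens]
    # Extra features: the semantic category of each word found in the lexicon.
    for token in tokens:
        for category in SEMANTIC_CLASSES.get(token, []):
            features.append("sem=" + category)
    return features


features = augment_features("What city hosted the games?")
```

In this sketch the short question gains a `sem=LOCATION` feature in addition to its five word features, which is the kind of extra signal a downstream classifier could use when word-based evidence is sparse.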
