A Survey of State-of-the-Art Methods on Question Classification

The task of question classification (QC) is to predict the entity type of a question written in natural language. This is done by assigning the question to one category from a set of predefined categories. Question classification is an important component of question answering systems, and it has attracted a notable amount of research over the past decade. This paper gives a comprehensive overview of state-of-the-art approaches to question classification, provides a detailed comparison of recent work, and discusses possible extensions to the QC problem.
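As a rough illustration of the task setup described above (mapping a question to one of a fixed set of categories), the following minimal sketch uses a few hand-written patterns. The category names, the rules, and the function name `classify_question` are all assumptions made for this example; they are not taken from the survey and do not represent any of the surveyed methods.

```python
import re

# Hypothetical coarse categories and hand-written rules, for illustration only.
# Rules are checked in order, so more specific patterns must come first.
RULES = [
    (re.compile(r"^(who|whom|whose)\b"), "HUMAN"),
    (re.compile(r"^where\b"), "LOCATION"),
    (re.compile(r"^(when|how (many|much|long|far|old))\b"), "NUMERIC"),
    (re.compile(r"^(why|how)\b"), "DESCRIPTION"),
]
DEFAULT_CATEGORY = "ENTITY"  # fallback when no rule matches


def classify_question(question: str) -> str:
    """Return one label from the predefined category set for a question."""
    text = question.strip().lower()
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return DEFAULT_CATEGORY


if __name__ == "__main__":
    for q in ["Who wrote Hamlet?", "How many moons does Mars have?"]:
        print(q, "->", classify_question(q))
```

A rule list like this only shows the input and output of the task; a learned classifier would replace the hand-written patterns with features extracted from the question.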
