An Effective Feature-Weighting Model for Question Classification

Question classification is one of the most important subtasks in question answering systems. Question taxonomies are becoming larger and more fine-grained to support better answer generation. Many approaches to question classification have been proposed and achieve reasonable results. However, all previous approaches use a learning algorithm to train a classifier on binary feature vectors extracted from a small set of labeled examples. In this paper we propose a feature-weighting model that assigns different weights to features instead of simple binary values. The main characteristic of this model is that it assigns more reasonable weights to features: the weights differentiate features from one another according to their contribution to question classification. Furthermore, features are weighted using not only a small labeled question collection but also a large unlabeled question collection. Experimental results show that with this new feature-weighting model the SVM-based classifier outperforms the one without it to some extent.
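To make the contrast between binary and weighted feature vectors concrete, the following is a minimal sketch and not the weighting scheme of the paper, which the abstract does not specify. It assumes a smoothed inverse-document-frequency weight estimated from an unlabeled question collection as a stand-in; the function names, the whitespace tokenizer, and the example questions are all hypothetical.

```python
import math
from collections import Counter


def tokenize(question):
    """Lowercase and split on whitespace; a stand-in for real feature extraction."""
    return question.lower().rstrip("?").split()


def idf_weights(unlabeled_questions):
    """ASSUMED weighting: smoothed inverse document frequency over unlabeled questions.

    This is an illustrative substitute, not the model proposed in the paper.
    """
    n = len(unlabeled_questions)
    df = Counter()
    for q in unlabeled_questions:
        df.update(set(tokenize(q)))  # count each term once per question
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def binary_vector(question, vocabulary):
    """Baseline representation: 1.0 if the feature occurs in the question, else 0.0."""
    present = set(tokenize(question))
    return [1.0 if term in present else 0.0 for term in vocabulary]


def weighted_vector(question, vocabulary, weights):
    """Weighted representation: the feature's weight if it occurs, else 0.0."""
    present = set(tokenize(question))
    return [weights.get(term, 0.0) if term in present else 0.0 for term in vocabulary]


# Hypothetical unlabeled collection used only to estimate the weights.
unlabeled = [
    "What is the capital of France?",
    "What is the boiling point of water?",
    "Who wrote the novel Dracula?",
    "Where is the Eiffel Tower?",
]
weights = idf_weights(unlabeled)
vocabulary = sorted(weights)

question = "Who is the author of Dracula?"
binary = binary_vector(question, vocabulary)
weighted = weighted_vector(question, vocabulary, weights)
```

Under this assumed scheme, a word such as "the", which occurs in every unlabeled question, receives a lower weight than a more discriminative word such as "who", whereas the binary vector treats both identically. Either vector could then be passed to an SVM classifier trained on the small labeled set.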
