A Classification of Questions Using SVM and Semantic Similarity Analysis

Question classification is an important component of a question answering system, and its results determine the quality of the system. This paper proposes a question classification algorithm based on a Support Vector Machine (SVM) and question semantic similarity, and applies it in a real-world online interactive question answering system for the tourism domain. The method works at two levels: an SVM model is trained as a classifier over the coarse categories, and a question semantic similarity model then assigns each question to a sub-category. Constructing concepts for domain terms improves the feature representation used by both the SVM and the question semantic similarity model. Experimental results show that the classification algorithm achieves an accuracy of up to 91.49%.
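
The two-level scheme can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the English toy questions, the category names, the stop-word list, the bag-of-words features, and the use of cosine similarity as a stand-in for the question semantic similarity model. The coarse stage is a small one-vs-rest linear SVM trained by subgradient descent on the regularised hinge loss, with the bias term omitted for brevity; the fine stage picks the sub-category of the most similar training question within the predicted coarse category.

```python
import math
from collections import Counter

# Hypothetical stop words and toy tourism questions (illustrative only).
STOPWORDS = {"the", "is", "a", "of", "to", "does", "what"}
TRAIN = [  # (question, coarse category, sub-category)
    ("how much is the bus ticket", "transport", "fare"),
    ("what is the train fare to the airport", "transport", "fare"),
    ("what time does the train leave", "transport", "schedule"),
    ("when does the last bus depart", "transport", "schedule"),
    ("how much is a hotel room per night", "lodging", "price"),
    ("what is the price of a hostel bed", "lodging", "price"),
    ("where is the hotel located", "lodging", "location"),
    ("which hostel is near the station", "lodging", "location"),
]

def features(text):
    """Term counts of a question after lower-casing and stop-word removal."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)

def dot(w, x):
    """Dot product of a sparse weight dict and a sparse feature vector."""
    return sum(w.get(t, 0.0) * v for t, v in x.items())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def train_binary_svm(xs, ys, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM (no bias) via subgradient descent on the hinge loss; ys are +1/-1."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * dot(w, x) < 1.0      # margin check before the update
            for t in w:                          # L2 regularisation shrink
                w[t] *= 1.0 - lr * lam
            if violated:                         # hinge-loss subgradient step
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w

def train_two_level():
    """Train one binary SVM per coarse category and keep the training vectors."""
    xs = [features(q) for q, _, _ in TRAIN]
    coarse_labels = sorted({c for _, c, _ in TRAIN})
    models = {
        c: train_binary_svm(xs, [1 if ci == c else -1 for _, ci, _ in TRAIN])
        for c in coarse_labels
    }
    return models, xs

def classify(question, models, xs):
    """Level 1: SVM picks the coarse category. Level 2: nearest question picks the sub-category."""
    x = features(question)
    coarse = max(models, key=lambda c: dot(models[c], x))
    candidates = [
        (cosine(x, xi), fine)
        for xi, (_, ci, fine) in zip(xs, TRAIN)
        if ci == coarse
    ]
    fine = max(candidates, key=lambda pair: pair[0])[1]
    return coarse, fine

if __name__ == "__main__":
    models, xs = train_two_level()
    print(classify("how much is the train ticket", models, xs))
    print(classify("where is the hostel located", models, xs))
```

Restricting the similarity search to questions in the predicted coarse category is what makes the method two-level: the fine stage only has to separate a few sub-categories, so a simple similarity measure can suffice. A real system would replace the token counts with features built from domain-term concepts, as the abstract describes.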