A template alignment algorithm for question classification

Question classification (QC) plays a key role in automated question answering (QA) systems. In Chinese QC, for example, a question is analyzed and then labeled with its question type and its expected answer type. In this paper, we propose a novel method for Chinese QC that integrates syntactic and semantic tags into an alignment-based approach. We adopt a template alignment (TA) algorithm to process large collections of Chinese questions and compare the classification results with those of INFOMAP, a human-annotated knowledge inference engine for Chinese questions. We experimented with two approaches for the proposed system: a majority algorithm and a machine learning method that uses a Support Vector Machine (SVM). The TA algorithm performs well with both approaches. The experimental results show that the accuracy achieved by TA (85.5%) is comparable to that of INFOMAP (88%). In contrast, QC based on the SVM approach, which incorporates syntactic features and TA, yields an accuracy of 91.5%.
