Automatic Construction of Semantic Dictionary for Question Categorization

An automatic method is proposed for building a semantic dictionary for question categorization from the existing questions in a pattern-based question answering system. The dictionary consists of two main parts: Semantic Domain Terms (SDT), a domain-specific term list, and Semantic Labeled Terms (SLT), which contains common terms tagged with semantic labels. Using the proposed method, the semantic dictionary is built from a set of 2509 questions in our system that have semantic patterns; a further 3390 questions without semantic patterns are used as ground truth to test its performance. Experimental results show that, compared with the baseline method, the constructed semantic dictionary improves the precision of question classification by 7.5% on average.
