Question Classification in English-Chinese Cross-Language Question Answering: An Integrated Genetic Algorithm and Machine Learning Approach

Question classification plays an important role in cross-language question answering (CLQA) systems, and question informers play a key role in enhancing question classification for factual question answering. In this paper, we propose an integrated genetic algorithm (GA) and machine learning (ML) approach for question classification in English-Chinese cross-language question answering. To enhance question informer prediction, we use a hybrid method that integrates GA with conditional random fields (CRF) to optimize feature subset selection in a CRF-based question informer prediction model. The proposed approach extends cross-language question classification by using the question informer predicted by GA-CRF as a feature in a support vector machine (SVM) classifier. Evaluations on the NTCIR-6 CLQA question sets demonstrate the efficacy of the approach in improving the accuracy of question classification in English-Chinese cross-language question answering.
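
To make the feature subset selection step concrete, the following is a minimal sketch of a genetic algorithm that searches over binary feature masks. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `evolve_feature_subset`, the GA settings (tournament selection, one-point crossover, bit-flip mutation, elitism), and the `toy_fitness` function are all hypothetical. In a GA-CRF setting, the fitness of a mask would presumably be the held-out performance of a CRF informer model trained with only the selected features; here a toy score stands in for that so the sketch runs on its own.

```python
import random


def evolve_feature_subset(n_features, fitness, pop_size=20, generations=30,
                          crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness`.

    Assumes n_features >= 2. All default settings are illustrative.
    """
    rng = random.Random(seed)
    # Each individual is a list of 0/1 genes: 1 keeps the feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Binary tournament: pick two distinct individuals, keep the fitter.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=fitness)
    return best


# Hypothetical stand-in for "train a CRF on the selected features and
# score it on held-out data": features 0, 2 and 5 help, the rest hurt.
USEFUL = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    noise = sum(mask) - hits
    return hits - 0.5 * noise


if __name__ == "__main__":
    best_mask = evolve_feature_subset(8, toy_fitness, seed=1)
    print(best_mask, toy_fitness(best_mask))
```

Because the best mask is copied unchanged into each new generation, the best fitness never decreases from one generation to the next. With a real CRF the fitness call dominates the cost, so caching scores for masks already evaluated would be a natural refinement.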
