ASQA: Academia Sinica Question Answering System for NTCIR-5 CLQA

We propose a hybrid architecture for the NTCIR-5 CLQA C-C (Cross-Language Question Answering, Chinese to Chinese) task. Our system, the Academia Sinica Question Answering System (ASQA), outputs exact answers to six types of factoid questions: personal names, location names, organization names, artifacts, times, and numbers. The architecture of ASQA comprises four main components: Question Processing, Passage Retrieval, Answer Extraction, and Answer Ranking. ASQA combines machine learning and knowledge-based approaches to answer Chinese factoid questions, achieving Top1 accuracy of 37.5% for correct answers and 44.5% for correct+unsupported answers.
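The two Top1 figures differ only in which judgments count as a hit. The following is a minimal sketch of that computation, not the official evaluation script: the judgment labels ("R" for correct, "U" for correct but unsupported, "W" for wrong), the function name `top1_accuracy`, and the toy data are all assumptions made for illustration.

```python
from typing import FrozenSet, Sequence

# Hypothetical judgment labels for each question's top-ranked answer:
# "R" = correct, "U" = correct but unsupported, "W" = wrong.
STRICT: FrozenSet[str] = frozenset({"R"})
LENIENT: FrozenSet[str] = frozenset({"R", "U"})


def top1_accuracy(judgments: Sequence[str], accepted: FrozenSet[str]) -> float:
    """Return the fraction of questions whose top answer has an accepted label."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    hits = sum(1 for label in judgments if label in accepted)
    return hits / len(judgments)


# Toy judgments for five questions (illustrative only, not the paper's data).
toy_judgments = ["R", "U", "W", "R", "W"]

strict_score = top1_accuracy(toy_judgments, STRICT)    # correct only
lenient_score = top1_accuracy(toy_judgments, LENIENT)  # correct + unsupported

print(f"Top1 (correct): {strict_score:.1%}")
print(f"Top1 (correct+unsupported): {lenient_score:.1%}")
```

Because every answer counted under the strict setting is also counted under the lenient one, the correct+unsupported score can never be lower than the correct-only score.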
