Semantic Chunk Annotation for complex questions using Conditional Random Field

This paper presents a CRF (Conditional Random Field) model for Semantic Chunk Annotation in a Chinese Question and Answering System (SCACQA). The model was derived from a corpus of real-world questions collected from Internet discussion groups. Because these questions were posted to be answered by other people, some of them are very complex. Mutual information was used for feature selection. The training set consists of 14,000 sentences and the test set consists of 4,000 sentences. The model achieves an F-score of 93.07%.
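The abstract does not state how mutual information was computed, so the following is only a minimal sketch of the standard definition, I(F; L) = sum over (f, l) of p(f, l) log2( p(f, l) / (p(f) p(l)) ), applied to ranking candidate features against chunk labels. The function names, the feature names, and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information (in bits) between feature values and labels.

    `pairs` is a list of (feature_value, label) tuples, one per token.
    """
    n = len(pairs)
    joint = Counter(pairs)
    feat = Counter(f for f, _ in pairs)
    lab = Counter(l for _, l in pairs)
    mi = 0.0
    for (f, l), c in joint.items():
        # p(f, l) / (p(f) * p(l)) simplifies to c * n / (count(f) * count(l))
        mi += (c / n) * math.log2(c * n / (feat[f] * lab[l]))
    return mi


def rank_features(feature_columns, labels):
    """Return feature names sorted by decreasing mutual information with labels.

    `feature_columns` maps a (hypothetical) feature name to its per-token values.
    """
    scores = {
        name: mutual_information(list(zip(values, labels)))
        for name, values in feature_columns.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical labels, not the paper's tag set).
labels = ["TOPIC", "TOPIC", "FOCUS", "FOCUS"]
features = {
    "informative": ["a", "a", "b", "b"],    # determines the label exactly
    "uninformative": ["a", "b", "a", "b"],  # independent of the label
}
ranking = rank_features(features, labels)
```

In a setup like this, only the highest-ranked features would be kept as CRF feature templates; the cutoff is a tuning choice that the abstract does not specify.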
