History question classification and representation for Chinese Gaokao

In this paper, we propose a question representation based on entity labeling and question classification for an automatic question answering system for Chinese Gaokao history questions. A CRF model is used for entity labeling, and SVM, CNN, and LSTM models are tested for question classification. Our experiments show that the CRF model performs well at labeling informative entities, while the neural network models show promising performance on the question classification task. With both the entity labeling and question classification models performing well, we can provide the KB-based question answering system with a highly reliable question representation. The question answering system can then build on the key information our models provide.
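To make the idea of a combined question representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the entity labeler emits BIO-style tags per token and the classifier emits a single class label. The tag names (`PER`, `EVENT`), the class label `cause`, and the names `QuestionRepresentation`, `bio_to_entities`, and `build_representation` are all hypothetical, chosen only for illustration; the tags and class are hard-coded stand-ins for model outputs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QuestionRepresentation:
    """Hypothetical container pairing a question class with labeled entities."""
    question_class: str
    entities: List[Tuple[str, str]]  # (entity type, surface text)


def bio_to_entities(tokens: List[str], tags: List[str], sep: str = "") -> List[Tuple[str, str]]:
    """Group BIO tags (assumed tagging scheme) into (type, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the open span, if any, and record it.
        if current_type is not None:
            entities.append((current_type, sep.join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags and stray "I-" tags both end the current span.
            flush()
            current_type, current_tokens = None, []
    flush()
    return entities


def build_representation(tokens: List[str], tags: List[str], question_class: str) -> QuestionRepresentation:
    """Combine the (assumed) labeler and classifier outputs into one record."""
    return QuestionRepresentation(question_class, bio_to_entities(tokens, tags))


# Stand-in outputs; a real system would obtain these from trained models.
tokens = ["秦", "始", "皇", "统", "一", "六", "国", "的", "原", "因"]
tags = ["B-PER", "I-PER", "I-PER", "B-EVENT", "I-EVENT", "I-EVENT", "I-EVENT", "O", "O", "O"]
rep = build_representation(tokens, tags, question_class="cause")
```

A downstream KB-based component could then query on `rep.entities` and choose an answering strategy from `rep.question_class`; how the actual system consumes the representation is not specified here.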
