Part-of-speech and position attention mechanism based BLSTM for question answering system

Attention-based bidirectional long short-term memory (BLSTM) networks have attracted increasing interest and are widely used in natural language processing tasks. Motivated by the performance of the attention mechanism, various attentive models have been proposed to improve the effectiveness of question answering. However, little research has focused on the impact of positional information on question answering, even though it has proven effective in information retrieval. In this paper, we assume that if a word appears in both the question sentence and the answer sentence, the words close to it should receive more attention, since they are more likely to contain valuable information for answering the question. Moreover, little research has incorporated part-of-speech information into question answering. We argue that words other than nouns, verbs, and pronouns tend to carry less useful information, so their positional impact can be neglected. Based on these two assumptions, we propose a part-of-speech and position attention mechanism based bidirectional long short-term memory network for question answering, abbreviated as DPOS-ATT-BLSTM, which cooperates with the traditional attention mechanism to obtain attentive answer representations. We experiment on a Chinese medical dataset collected from http://www.xywy.com/ and http://www.haodf.com/, and compare against methods based on the traditional attention mechanism. The experimental results demonstrate the good performance and efficiency of our proposed model.
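
To make these assumptions concrete, below is a minimal sketch in Python of the position-and-POS weighting idea, assuming a Gaussian proximity kernel in the spirit of positional language models for information retrieval; the abstract does not give the paper's exact formula, so the function position_weights, the tag set CONTENT_TAGS, and the parameter sigma are all illustrative, not the authors' actual formulation.

import math

# POS tags assumed (per the second assumption above) to carry useful information.
CONTENT_TAGS = {"NOUN", "VERB", "PRON"}

def position_weights(question_tokens, answer_tokens, answer_pos_tags, sigma=2.0):
    """Weight each answer word by its proximity to words that also appear
    in the question, neglecting words whose POS tag is outside CONTENT_TAGS."""
    question_vocab = set(question_tokens)
    # Positions in the answer where a question word re-occurs.
    overlap = [i for i, tok in enumerate(answer_tokens) if tok in question_vocab]
    weights = []
    for i, tag in enumerate(answer_pos_tags):
        if tag not in CONTENT_TAGS or not overlap:
            weights.append(0.0)  # positional impact neglected for other words
            continue
        d = min(abs(i - j) for j in overlap)  # distance to the nearest overlap word
        weights.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights

In a full model, weights of this kind would be combined with the scores produced by the standard attention mechanism over the BLSTM hidden states, for example by reweighting the attention scores before the softmax.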
