Abstract

We develop a model to satisfy the requirements of Dialog System Technology Challenge 6 (DSTC6) Track 1: building an end-to-end dialog system for goal-oriented applications. This task involves learning a dialog policy from transactional dialogs in a given domain, and automatic system responses are generated from the given task-oriented dialog data (http://workshop.colips.org/dstc6/index.html). As this task has a structure similar to a question answering task (Weston et al., 2015), we employ the MemN2N architecture (Sukhbaatar et al., 2015), which outperforms models based on recurrent neural networks or long short-term memory (LSTM). However, two problems arise when applying this model to the DSTC6 task. First, we encounter an out-of-vocabulary problem, which we resolve by categorizing words that exist in the knowledge base into metadata types; these metadata types are similar to named entities. Second, the original memory network has a weak ability to reflect temporal information, because it uses only sentence-level embeddings. We therefore add a bidirectional LSTM (Bi-LSTM) at the beginning of the model to better capture temporal information. The experimental results demonstrate that our model reflects temporal features well. Furthermore, our model achieves state-of-the-art performance among memory network models, and is comparable to hybrid code networks (Ham et al., 2017) and the hierarchical LSTM model (Bai et al., 2017), which is not an end-to-end architecture.
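The out-of-vocabulary fix described above can be illustrated as simple entity delexicalization: tokens that appear in the knowledge base are replaced by their metadata type token, so unseen entity values map onto a small, closed vocabulary. This is only a minimal sketch of the idea; the lookup table, token names, and function below are hypothetical illustrations, not the authors' actual code.

```python
# Hypothetical knowledge-base lookup: surface form -> metadata type token.
# In the DSTC6 setting these types would come from the restaurant KB
# (cuisine, location, price range, etc.); the entries here are made up.
KB_METADATA = {
    "paris": "<location>",
    "british": "<cuisine>",
    "resto_1": "<restaurant_name>",
}

def delexicalize(tokens, kb=KB_METADATA):
    """Replace KB entity tokens with their metadata type token,
    leaving all other tokens unchanged."""
    return [kb.get(tok.lower(), tok) for tok in tokens]

print(delexicalize(["I", "want", "British", "food", "in", "Paris"]))
# -> ['I', 'want', '<cuisine>', 'food', 'in', '<location>']
```

Because every entity value collapses to its type token, a restaurant name never seen in training no longer produces an unknown-word embedding at test time; the model only has to learn the closed set of type tokens.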
References

[1] Jason Weston et al., "End-To-End Memory Networks," NIPS, 2015.
[2] Richard Socher et al., "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing," ICML, 2015.
[3] Jeffrey Pennington et al., "GloVe: Global Vectors for Word Representation," EMNLP, 2014.
[4] Richard Socher et al., "Dynamic Memory Networks for Visual and Textual Question Answering," ICML, 2016.
[5] Jürgen Schmidhuber et al., "Training Very Deep Networks," NIPS, 2015.