Abstract

We develop a model to satisfy the requirements of Dialog System Technology Challenge 6 (DSTC6) Track 1: building an end-to-end dialog system for goal-oriented applications. This task involves learning a dialog policy from transactional dialogs in a given domain, and automatic system responses are generated from the given task-oriented dialog data (http://workshop.colips.org/dstc6/index.html). As this task has a structure similar to a question answering task (Weston et al., 2015), we employ the MemN2N architecture (Sukhbaatar et al., 2015), which outperforms models based on recurrent neural networks or long short-term memory (LSTM). However, two problems arise when applying this model to the DSTC6 task. First, we encounter an out-of-vocabulary problem, which we resolve by categorizing words that exist in the knowledge base into metadata types; these metadata types are similar to named entities. Second, the original memory network has a weak ability to reflect temporal information, because it uses only sentence-level embeddings. We therefore add a bidirectional LSTM (Bi-LSTM) at the beginning of the model to better capture temporal information. The experimental results demonstrate that our model reflects temporal features well. Furthermore, our model achieves state-of-the-art performance among memory network models, and is comparable to hybrid code networks (Ham et al., 2017) and the hierarchical LSTM model (Bai et al., 2017), which is not an end-to-end architecture.
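The out-of-vocabulary fix described above can be illustrated as simple entity delexicalization: tokens that appear in the knowledge base are replaced by their metadata type token, so unseen entity values map onto a small, closed vocabulary. This is only a minimal sketch of the idea; the lookup table, token names, and function below are hypothetical illustrations, not the authors' actual code.

```python
# Hypothetical knowledge-base lookup: surface form -> metadata type token.
# In the DSTC6 setting these types would come from the restaurant KB
# (cuisine, location, price range, etc.); the entries here are made up.
KB_METADATA = {
    "paris": "<location>",
    "british": "<cuisine>",
    "resto_1": "<restaurant_name>",
}

def delexicalize(tokens, kb=KB_METADATA):
    """Replace KB entity tokens with their metadata type token,
    leaving all other tokens unchanged."""
    return [kb.get(tok.lower(), tok) for tok in tokens]

print(delexicalize(["I", "want", "British", "food", "in", "Paris"]))
# -> ['I', 'want', '<cuisine>', 'food', 'in', '<location>']
```

Because every entity value collapses to its type token, a restaurant name never seen in training no longer produces an unknown-word embedding at test time; the model only has to learn the closed set of type tokens.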
References

[1] Jason Weston et al., "End-To-End Memory Networks," NIPS, 2015.
[2] Richard Socher et al., "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing," ICML, 2015.
[3] Jeffrey Pennington et al., "GloVe: Global Vectors for Word Representation," EMNLP, 2014.
[4] Richard Socher et al., "Dynamic Memory Networks for Visual and Textual Question Answering," ICML, 2016.
[5] Jürgen Schmidhuber et al., "Training Very Deep Networks," NIPS, 2015.