Paper Information - Global-Locally Self-Attentive Encoder for Dialogue State Tracking

Dialogue state tracking, which estimates user goals and requests given the dialogue context, is an essential part of task-oriented dialogue systems. In this paper, we propose the Global-Locally Self-Attentive Dialogue State Tracker (GLAD), which learns representations of the user utterance and previous system actions with global-local modules. Our model uses global modules to share parameters between estimators for different types (called slots) of dialogue states, and uses local modules to learn slot-specific features. We show that this significantly improves tracking of rare states. GLAD obtains 88.3% joint goal accuracy and 96.4% request accuracy on the WoZ state tracking task, outperforming prior work by 3.9% and 4.8%, respectively. On the DSTC2 task, our model obtains 74.7% joint goal accuracy and 97.3% request accuracy, outperforming prior work by 1.3% and 0.8%, respectively.
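
The global-local idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the layers, so plain linear maps stand in for the encoders, and the per-slot sigmoid mixing weight (`mix_logit`), the class name `GlobalLocalEncoder`, and the slot names are all assumptions made for illustration. The sketch only shows one shared (global) module being combined with one slot-specific (local) module per slot.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_linear(n_in, n_out, rng):
    # Randomly initialised weight matrix, stored as n_out rows of n_in weights.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class GlobalLocalEncoder:
    """Hypothetical global-local module: one shared map plus one map per slot."""

    def __init__(self, slots, n_in, n_out, seed=0):
        rng = random.Random(seed)
        # Global module: a single set of parameters shared by every slot.
        self.global_w = make_linear(n_in, n_out, rng)
        # Local modules: separate parameters for each slot.
        self.local_w = {s: make_linear(n_in, n_out, rng) for s in slots}
        # Assumed per-slot mixing logit; it would be learned during training,
        # but is fixed at 0.0 (an even mix) in this sketch.
        self.mix_logit = {s: 0.0 for s in slots}

    def encode(self, slot, x):
        beta = sigmoid(self.mix_logit[slot])
        global_out = apply_linear(self.global_w, x)
        local_out = apply_linear(self.local_w[slot], x)
        # Weighted combination of slot-specific and shared features.
        return [beta * l + (1.0 - beta) * g for l, g in zip(local_out, global_out)]


encoder = GlobalLocalEncoder(["food", "area"], n_in=4, n_out=3)
utterance_features = [1.0, 0.5, -0.5, 2.0]
food_repr = encoder.encode("food", utterance_features)
area_repr = encoder.encode("area", utterance_features)
```

Because `global_w` receives updates from every slot during training, a slot with few examples can still rely on features learned from the others, which is one plausible reading of why sharing helps with rare states.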