Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training

Dialogue Act (DA) classification is a challenging problem in dialogue interpretation, which aims to attach semantic labels to utterances and characterize the speaker’s intention. Currently, many existing approaches formulate the DA classification problem ranging from multi-classification to structured prediction, which suffer from two limitations: a) these methods are either handcrafted feature-based or have limited memories. b) adversarial examples can’t be correctly classified by traditional training methods. To address these issues, in this paper we first cast the problem into a question and answering problem and proposed an improved dynamic memory networks with hierarchical pyramidal utterance encoder. Moreover, we apply adversarial training to train our proposed model. We evaluate our model on two public datasets, i.e., Switchboard dialogue act corpus and the MapTask corpus. Extensive experiments show that our proposed model is not only robust, but also achieves better performance when compared with some state-of-the-art baselines.

[1]  Kam-Fai Wong,et al.  Towards Neural Network-based Reasoning , 2015, ArXiv.

[2]  Richard Socher,et al.  Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.

[3]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[4]  David Traum,et al.  Speech Acts for Dialogue Agents , 1999 .

[5]  John J. Godfrey,et al.  SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6]  Jiebo Luo,et al.  Image Captioning with Semantic Attention , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[8]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[9]  Norbert Reithinger,et al.  Dialogue act classification using language models , 1997, EUROSPEECH.

[10]  Philip S. Yu,et al.  Improving Automatic Source Code Summarization via Deep Reinforcement Learning , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[11]  Rodney D. Nielsen,et al.  Dialogue Act Classification in Domain-Independent Conversations Using a Deep Recurrent Neural Network , 2016, COLING.

[12]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[13]  Atiq Islam,et al.  Dialog Act Classification Using Acoustic and Discourse Information of Maptask Data , 2010, Int. J. Comput. Intell. Appl..

[14]  Ingrid Zukerman,et al.  Preserving Distributional Information in Dialogue Act Classification , 2017, EMNLP.

[15]  Anne H. Anderson,et al.  The Hcrc Map Task Corpus , 1991 .

[16]  Daniel Jurafsky,et al.  Data Noising as Smoothing in Neural Network Language Models , 2017, ICLR.

[17]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[18]  Gholamreza Haffari,et al.  A Latent Variable Recurrent Neural Network for Discourse Relation Language Models , 2016, ArXiv.

[19]  Philip S. Yu,et al.  Not Just Privacy: Improving Performance of Private Deep Learning in Mobile Cloud , 2018, KDD.

[20]  Gholamreza Haffari,et al.  A Latent Variable Recurrent Neural Network for Discourse Relation Language Models , 2016 .

[21]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[22]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[23]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[24]  Andrew M. Dai,et al.  Adversarial Training Methods for Semi-Supervised Text Classification , 2016, ICLR.

[25]  Deng Cai,et al.  Dialogue Act Recognition via CRF-Attentive Structured Network , 2017, SIGIR.

[26]  Kristy Elizabeth Boyer,et al.  Understanding Student Language: An Unsupervised Dialogue Act Classification Approach , 2015, EDM 2015.

[27]  Yun Lei,et al.  Using Context Information for Dialog Act Classification in DNN Framework , 2017, EMNLP.

[28]  Timothy Baldwin,et al.  Robust Training under Linguistic Adversity , 2017, EACL.

[29]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[30]  Andrew M. Dai,et al.  Virtual Adversarial Training for Semi-Supervised Text Classification , 2016, ArXiv.

[31]  Gina-Anne Levow,et al.  Dialog act tagging with support vector machines and hidden Markov models , 2006, INTERSPEECH.

[32]  Yueting Zhuang,et al.  Video Question Answering via Hierarchical Spatio-Temporal Attention Networks , 2017, IJCAI.

[33]  Quoc V. Le,et al.  Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[34]  Dan Boneh,et al.  Ensemble Adversarial Training: Attacks and Defenses , 2017, ICLR.

[35]  Harry Bunt,et al.  A Multidimensional Approach to Utterance Segmentation and Dialogue Act Classification , 2007, SIGDIAL.

[36]  David Bamman,et al.  Adversarial Training for Relation Extraction , 2017, EMNLP.

[37]  Franck Dernoncourt,et al.  Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks , 2016, NAACL.

[38]  Norbert Reithinger,et al.  Predicting dialogue acts for a speech-to-speech translation system , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[39]  Harshit Kumar,et al.  Dialogue Act Sequence Labeling using Hierarchical encoder with CRF , 2017, AAAI.

[40]  Ingrid Zukerman,et al.  A Generative Attentional Neural Network Model for Dialogue Act Classification , 2017, ACL.

[41]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[42]  Andreas Stolcke,et al.  Dialogue act modeling for automatic tagging and recognition of conversational speech , 2000, CL.

[43]  Deng Cai,et al.  MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension , 2017, ArXiv.

[44]  Hal Daumé,et al.  Deep Unordered Composition Rivals Syntactic Methods for Text Classification , 2015, ACL.

[45]  Phil Blunsom,et al.  Recurrent Convolutional Neural Networks for Discourse Compositionality , 2013, CVSM@ACL.

[46]  Philip S. Yu,et al.  Layerwise Perturbation-Based Adversarial Training for Hard Drive Health Degree Prediction , 2018, 2018 IEEE International Conference on Data Mining (ICDM).

[47]  David Vilar,et al.  Dialogue act classification using a Bayesian approach ∗ , 2004 .

[48]  Alex Graves,et al.  Neural Turing Machines , 2014, ArXiv.