Speaker-change Aware CRF for Dialogue Act Classification

Recent work in Dialogue Act (DA) classification approaches the task as a sequence labeling problem, using neural network models with a Conditional Random Field (CRF) as the last layer. A CRF models the conditional probability of the target DA label sequence given the input utterance sequence. However, the task involves another important input sequence, that of speakers, which previous work ignores. To address this limitation, this paper proposes a simple modification of the CRF layer that takes speaker-change into account. Experiments on the SwDA corpus show that our modified CRF layer outperforms the original one, with very wide margins for some DA labels. Further, visualizations demonstrate that our CRF layer can learn meaningful, sophisticated transition patterns between DA label pairs conditioned on speaker-change, in an end-to-end way. Code is publicly available.
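
To make the idea of conditioning label transitions on speaker-change concrete, the following is a minimal sketch in plain Python. It is not the authors' released code. It assumes one plausible parameterization, in which the CRF keeps two transition matrices (one used when the speaker stays the same, one used when the speaker changes) and picks between them at each step. The function names, the two-label inventory (0 = question, 1 = answer), and all scores are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

def speaker_changes(speakers: Sequence[str]) -> List[int]:
    """Return 1 where the speaker differs from the previous utterance, else 0.
    The first utterance has no predecessor and is given 0."""
    return [0] + [int(a != b) for a, b in zip(speakers, speakers[1:])]

def sequence_score(emissions, transitions, labels, changes) -> float:
    """Unnormalized CRF score of one label sequence.
    emissions[t][y]      : score of label y for utterance t
    transitions[c][i][j] : score of moving from label i to label j
                           when the speaker-change flag is c (assumed layout)
    """
    score = emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += transitions[changes[t]][labels[t - 1]][labels[t]]
        score += emissions[t][labels[t]]
    return score

def viterbi(emissions, transitions, changes) -> List[int]:
    """Highest-scoring label sequence under the speaker-conditioned transitions."""
    n_labels = len(emissions[0])
    best = list(emissions[0])   # best score of any path ending in each label
    back = []                   # back-pointers, one list per step t >= 1
    for t in range(1, len(emissions)):
        trans = transitions[changes[t]]  # matrix selected by speaker-change
        new_best, pointers = [], []
        for j in range(n_labels):
            i_star = max(range(n_labels), key=lambda i: best[i] + trans[i][j])
            pointers.append(i_star)
            new_best.append(best[i_star] + trans[i_star][j] + emissions[t][j])
        best = new_best
        back.append(pointers)
    path = [max(range(n_labels), key=lambda j: best[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Toy setup (hypothetical scores). Only the first utterance has an
# informative emission, so the transitions decide the rest.
EMISSIONS = [[2.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
TRANSITIONS = [
    [[1.0, -1.0], [-1.0, 1.0]],  # same speaker: labels tend to persist
    [[0.0, 2.0], [0.0, 0.0]],    # speaker change: question -> answer is favored
]

with_change = viterbi(EMISSIONS, TRANSITIONS, speaker_changes(["A", "B", "B"]))
without_change = viterbi(EMISSIONS, TRANSITIONS, speaker_changes(["A", "A", "A"]))
print(with_change, without_change)
```

With identical emissions, the decoded labels differ depending only on the speaker sequence, which is the kind of information a standard CRF layer with a single transition matrix cannot use.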
