End-to-End Multi-Level Dialog Act Recognition

The three-level dialog act annotation scheme of the DIHANA corpus poses a multi-level classification problem in which the bottom levels allow multiple labels, or none, for a single segment. We address automatic dialog act recognition on the three levels with an end-to-end approach, in order to implicitly capture the relations between them. Our deep neural network classifier combines word- and character-based segment representations with a summary of the dialog history and information concerning speaker changes. We show that it is important to specialize the generic segment representation so that it captures the most relevant information for each level. The summary of the dialog history, on the other hand, should combine information from the three levels to capture the dependencies between them. Furthermore, the labels generated for each level help in predicting those of the lower levels. Overall, our results surpass those of our previous approach, which hierarchically combined three independent per-level classifiers. They even surpass the results achieved on the simplified version of the problem addressed by previous studies, which neglected the multi-label nature of the bottom levels and only considered the label combinations present in the corpus.
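
To make the multi-label aspect concrete, the following is a minimal sketch of how per-level output scores could be decoded, not the implementation used in this work. It assumes that the top level receives exactly one label (softmax followed by argmax) and that each bottom level receives every label whose independent sigmoid probability reaches a threshold, so that a segment may end up with several labels or none. The function names, the label names, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Map one raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def decode_single_label(scores, labels):
    """Assumed top-level decoding: pick the single most probable label."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def decode_multi_label(scores, labels, threshold=0.5):
    """Assumed bottom-level decoding: keep every label at or above the threshold.

    The result may be empty, which corresponds to a segment with no label
    on that level.
    """
    return [lab for s, lab in zip(scores, labels) if sigmoid(s) >= threshold]


# Hypothetical label inventories and raw network outputs for one segment.
level1_labels = ["Question", "Answer", "Closing"]
level2_labels = ["Departure-hour", "Fare", "Destination"]

top = decode_single_label([2.1, 0.3, -1.0], level1_labels)
bottom = decode_multi_label([1.4, -2.0, 0.8], level2_labels)
empty = decode_multi_label([-1.5, -2.0, -0.7], level2_labels)
print(top, bottom, empty)
```

Treating each bottom-level label as an independent binary decision is what allows label combinations that never occur in the training data to be predicted, in contrast to the simplified setting that only considers the combinations present in the corpus.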
