Hierarchical Multi-Label Dialog Act Recognition on Spanish Data

Dialog acts reveal the intention behind the uttered words. Thus, their automatic recognition is important for a dialog system trying to understand its conversational partner. The study presented in this article approaches that task on the DIHANA corpus, whose three-level dialog act annotation scheme poses problems that have not been explored in recent studies. In addition to the hierarchical problem, the two lower levels pose multi-label classification problems. Furthermore, each level in the hierarchy refers to a different aspect of the speaker's intention, in terms of both the structure of the dialog and the task. Because the dialogs are in Spanish, the corpus also allows us to assess whether state-of-the-art approaches developed on English data generalize to a different language. More specifically, we compare the performance of different segment representation approaches, focusing on both sequences and patterns of words, and assess the importance of the dialog history and of the relations between the multiple levels of the hierarchy. Concerning the single-label classification problem posed by the top level, we show that the conclusions drawn on English data also hold on Spanish data. Furthermore, we show that the approaches can be adapted to multi-label scenarios. Finally, by hierarchically combining the best classifiers for each level, we achieve the best results reported for this corpus.
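
To make the distinction between the single-label top level and the multi-label lower levels concrete, the following is a minimal sketch of how per-level classifier scores might be turned into a three-level annotation. It is not the authors' implementation: the function name `decode_hierarchical`, the 0.5 threshold, and the label names in the usage example are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def decode_hierarchical(
    top_scores: Dict[str, float],
    mid_scores: Dict[str, float],
    low_scores: Dict[str, float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the article
) -> Tuple[str, List[str], List[str]]:
    """Turn per-level scores into one three-level annotation.

    Top level: single-label, so pick the highest-scoring label.
    Lower levels: multi-label, so keep every label whose score
    reaches the threshold (possibly none, possibly several).
    """
    top = max(top_scores, key=top_scores.get)
    mid = sorted(label for label, p in mid_scores.items() if p >= threshold)
    low = sorted(label for label, p in low_scores.items() if p >= threshold)
    return top, mid, low


# Hypothetical scores for one segment; label names are illustrative only.
annotation = decode_hierarchical(
    {"Question": 0.7, "Answer": 0.2, "Closing": 0.1},
    {"Departure-hour": 0.8, "Price": 0.6, "Nil": 0.1},
    {"Destination": 0.9, "Day": 0.3},
)
print(annotation)
```

In this sketch each level is decoded independently; an approach that exploits the relations between levels would additionally feed the prediction of one level into the classifier for the next.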
