A Multilingual and Multidomain Study on Dialog Act Recognition Using Character-Level Tokenization

Automatic dialog act recognition is an important step for dialog systems, since it reveals the intention behind the words uttered by their conversational partners. Although most approaches to the task use word-level tokenization, there is information at the sub-word level that is related to the function of the words and, consequently, to the intention behind them. Thus, in this study, we explored the use of character-level tokenization to capture that information. We used multiple character windows of different sizes to capture morphological aspects, such as affixes and lemmas, as well as inter-word information. Furthermore, we assessed the importance of punctuation and capitalization for the task. To broaden the conclusions of our study, we performed experiments on dialogs in three languages (English, Spanish, and German), which have different morphological characteristics. In addition, the dialogs cover multiple domains and are annotated with both domain-dependent and domain-independent dialog act labels. The results show not only that the character-level approach leads to similar or better performance than the state-of-the-art word-level approaches on the task, but also that the two approaches capture complementary information. Thus, the best results are achieved by combining tokenization at both levels.
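
As a rough illustration of the idea of character windows of different sizes, the following minimal sketch splits an utterance into characters and collects every contiguous window for each size. It is not the model used in the study: the function names, the window sizes, and the `keep_case` and `keep_punct` switches are hypothetical choices made only for this example, with the two switches mirroring the question of whether capitalization and punctuation matter.

```python
import string


def char_tokenize(utterance, keep_case=True, keep_punct=True):
    """Split an utterance into characters (spaces are kept).

    keep_case / keep_punct are illustrative switches, not options
    taken from the study.
    """
    if not keep_case:
        utterance = utterance.lower()
    if not keep_punct:
        # Drop ASCII punctuation only; other symbols are left untouched.
        utterance = "".join(c for c in utterance if c not in string.punctuation)
    return list(utterance)


def char_windows(chars, size):
    """Return every contiguous window of `size` characters as a string."""
    return ["".join(chars[i:i + size]) for i in range(len(chars) - size + 1)]


def multi_window_features(utterance, sizes=(3, 5, 7), **options):
    """Map each window size to the list of windows found in the utterance.

    The default sizes are an assumption made for this sketch.
    """
    chars = char_tokenize(utterance, **options)
    return {size: char_windows(chars, size) for size in sizes}


if __name__ == "__main__":
    features = multi_window_features("Is it raining?", sizes=(3, 5))
    for size, windows in features.items():
        print(size, windows)
```

Short windows tend to cover affix-like fragments, while longer ones can span whole short words or cross word boundaries, which is one way to picture how different window sizes might expose morphological and inter-word information.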
