Mapping the Dialog Act Annotations of the LEGO Corpus into ISO 24617-2 Communicative Functions

ISO 24617-2, the ISO standard for dialog act annotation, lays the groundwork for more comparable research in the area. However, the amount of data annotated according to it is still small, which hinders the development of approaches for automatic recognition. In this paper, we describe a mapping of the original dialog act labels of the LEGO corpus, which have been neglected, into the communicative functions of the standard. Although this does not lead to a complete annotation according to the standard, the 347 dialogs provide a relevant amount of data that can be used in the development of automatic communicative function recognition approaches, which may lead to a wider adoption of the standard. Using the 17 English dialogs of the DialogBank as a gold standard, our preliminary experiments show that including the mapped dialogs during the training phase improves performance when recognizing communicative functions in the Task dimension.
