A New Metric for the Evaluation of Dialog Act Classification

The standard evaluation metrics for dialog act classifiers are based on the boolean outcome of exact classification: a predicted tag counts as correct only if it matches the reference tag exactly. For multidimensional tag sets, such as the ICSI-MRDA tag set, this is stricter than necessary, since a misclassification may be only partial, and a partially correct tag can be good enough for the application in which the classifier is embedded. We propose a new, forgiving metric and present some preliminary results. We also sketch future work.
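To make the contrast between exact and forgiving scoring concrete, here is a minimal sketch. It is not the metric proposed in the paper, whose definition is not given in this abstract. It assumes that each multidimensional tag can be represented as a set of component labels, and it uses Jaccard overlap as one possible stand-in for partial credit. The function names and the example labels are hypothetical.

```python
from typing import FrozenSet, Sequence

Tag = FrozenSet[str]  # assumption: a multidimensional tag is a set of component labels


def exact_accuracy(gold: Sequence[Tag], pred: Sequence[Tag]) -> float:
    """Boolean scoring: a prediction counts only if every component matches."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def forgiving_accuracy(gold: Sequence[Tag], pred: Sequence[Tag]) -> float:
    """Partial-credit scoring: mean Jaccard overlap between gold and predicted tags.

    Jaccard overlap is an illustrative choice, not the paper's definition.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    total = 0.0
    for g, p in zip(gold, pred):
        union = g | p
        # Two empty tags are treated as a perfect match.
        total += len(g & p) / len(union) if union else 1.0
    return total / len(gold)


# Hypothetical component labels, used only for illustration.
gold = [frozenset({"s", "aa"}), frozenset({"qy", "rt"}), frozenset({"b"})]
pred = [frozenset({"s"}), frozenset({"qy", "rt"}), frozenset({"fh"})]

exact = exact_accuracy(gold, pred)
forgiving = forgiving_accuracy(gold, pred)
```

In this example the first prediction recovers one of two gold components. Exact scoring counts it as wrong, while the overlap-based score gives it half credit, so the forgiving score is higher than the exact accuracy for the same predictions.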
