Disfluency and Laughter Annotation in a Light-weight Dialogue Mark-up Protocol

Despite a great deal of research effort, disfluency and laughter annotation is still an unsolved problem, both in terms of consensus on a generally applicable system and in terms of annotation agreement metrics. In this paper we present a new annotation scheme within a light-weight mark-up for spontaneous speech. We show that, despite the low overhead required for understanding the annotation protocol, it allows for good inter-annotator agreement and can be mapped onto existing disfluency categorizations with no loss of information.
