Representational issues in annotation: Using the Australian map task corpus to relate prosody and discourse structure

Abstract This paper reports part of an ongoing investigation of the interaction of prosody and discourse structure. A digital speech corpus (four dialogues from the ANDOSL Australian map task corpus) was coded for prosodic structure using ToBI. Independently, two different coding systems for dialogue micro-structure were applied to the same corpus: the HCRC map task coding scheme (Carletta et al., 1996, 1997b) and the 'Switchboard' version of the DRI/DAMSL scheme (Jurafsky et al., 1997). We investigated whether silent pause location and duration, intonational boundaries associated with Break Indices 3 and 4, and pitch range reset were significantly correlated with dialogue act boundaries, as has been found for other varieties of English (e.g., Lehiste, 1975; Hirschberg and Nakatani, 1996; Silverman, 1987) and for Dutch (Swerts, 1997). The dialogue coding systems were systematically evaluated both against one another and in terms of their correlation with the prosodic structure. The paper explores a number of methodological issues that arise when comparing and relating structures from different domains of analysis across a large speech corpus. It also exemplifies how annotated corpora can be used to evaluate theories and systems.

[1] J.B. Millar, et al. The Australian National Database of Spoken Language, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Jonathan Harrington, et al. The mu+ system for corpus based speech research, 1993, Comput. Speech Lang.

[3] A. Ichikawa, et al. An Analysis of Turn-Taking and Backchannels Based on Prosodic and Syntactic Features in Japanese Map Task Dialogs, 1998, Language and Speech.

[4] Anne H. Anderson, et al. The HCRC Map Task Corpus, 1991.

[5] Lars Ahrenberg, Nils Dahlbäck. Coding Schemes for Studies of Natural Language Dialogue, 1995.

[6] Andreas Stolcke, et al. Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? 1998, Language and Speech.

[7] Julia Hirschberg, et al. Some intonational characteristics of discourse structure, 1992, ICSLP.

[8] M. Swerts. Prosodic features at discourse boundaries of different strength, 1997, The Journal of the Acoustical Society of America.

[9] I. Lehiste. The Phonetic Structure of Paragraphs, 1975.

[10] Julia Hirschberg, et al. A Prosodic Analysis of Discourse Segments in Direction-Giving Monologues, 1996, ACL.

[11] D. Traum, et al. Coding Discourse Structure in Dialogue (Version 1.0), 1999.

[12] A. Cohen, et al. Structure and Process in Speech Perception, 1975.

[13] Julia Hirschberg, et al. Instructions for annotating discourse, 1995.

[14] Julia Hirschberg, et al. Discourse Structure in Spoken Language: Studies on Speech Corpora, 1995.

[15] Gwyneth Doherty-Sneddon, et al. The Reliability of a Dialogue Structure Coding Scheme, 1997, CL.

[16] David R. Traum, et al. Conversation Acts in Task-Oriented Spoken Dialogue, 1992, Comput. Intell.

[17] R. J. Lickley, et al. Proceedings of the International Conference on Spoken Language Processing, 1992.