INFERRING DISCOURSE STRUCTURE FROM SPEECH

The CLARITY project explores the use of discourse structure in understanding conversational speech. We aim to develop automatic classifiers for three levels of discourse structure in Spanish telephone conversations: speech acts, dialogue games, and discourse segments. This paper presents our first results and research plans in three areas: (1) the definition of discourse structure units and the manual annotation of CALLHOME SPANISH; (2) speech recognition; and (3) automated segmentation and labeling of speech acts.
