Modeling vocal interaction for text-independent detection of involvement hotspots in multi-party meetings

Indexing, retrieval, and summarization of meeting recordings have, to date, focused largely on the propositional content of what participants say. Although objectively relevant, such content may not be the sole or even the main interest of potential system users. Instead, users may be interested in information bearing on conversation flow. We explore the automatic detection of one example of such information, namely hotspots defined in terms of participant involvement. Our proposed system relies exclusively on low-level vocal activity features and yields a classification accuracy of 84%, a 39% relative reduction in error over a baseline that selects the majority class.
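The two reported figures (84% accuracy, 39% relative error reduction) together imply the accuracy of the majority-class baseline, which the abstract does not state directly. The following is a minimal sketch of that arithmetic, assuming the usual definition of relative error reduction, (baseline error - system error) / baseline error; the function names are hypothetical, and because the reported figures are rounded the result is only approximate (about 74%).

```python
def relative_error_reduction(baseline_accuracy: float, system_accuracy: float) -> float:
    """Fraction of the baseline's errors that the system removes."""
    baseline_error = 1.0 - baseline_accuracy
    system_error = 1.0 - system_accuracy
    return (baseline_error - system_error) / baseline_error


def implied_baseline_accuracy(system_accuracy: float, reduction: float) -> float:
    """Invert the definition above to recover the baseline accuracy."""
    system_error = 1.0 - system_accuracy
    baseline_error = system_error / (1.0 - reduction)
    return 1.0 - baseline_error


# Figures taken from the abstract; both are rounded in the source.
system_accuracy = 0.84
reduction = 0.39

baseline_accuracy = implied_baseline_accuracy(system_accuracy, reduction)
print(f"implied majority-class baseline accuracy: {baseline_accuracy:.3f}")
```

Under this reading, roughly three quarters of the classified segments belong to the majority class, so the classification task is noticeably imbalanced.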
