Modeling other talkers for improved dialog act recognition in meetings

Automatic dialog act (DA) modeling has been shown to benefit meeting understanding, but current approaches to DA recognition tend to suffer from a common problem: they underrepresent behaviors found at turn edges, during which the “floor” is negotiated among meeting participants. We propose a new approach that takes into account speech from other talkers, relying only on speech/non-speech information from all participants. We find (1) that modeling other participants improves DA detection, even in the absence of other information, (2) that only the single locally most talkative other participant matters, and (3) that 10 seconds provides a sufficiently large local context. Results further show significant performance improvements over a lexical-only system, particularly for the DAs of interest. We conclude that interaction-based modeling at turn edges can be achieved by relatively simple features and should be incorporated for improved meeting understanding.
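As a rough illustration of the kind of feature described above, the following sketch extracts a 10-second speech/non-speech context around a frame for a target talker, together with the same context for the single other participant who speaks most within that window. It is not the authors' implementation: the frame rate of 10 frames per second, the centering of the window on the frame, the zero-padding at meeting boundaries, and the names `window` and `context_features` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

FRAMES_PER_SECOND = 10   # assumed frame rate; not stated in the abstract
CONTEXT_SECONDS = 10     # context width reported as sufficient in the abstract
HALF = FRAMES_PER_SECOND * CONTEXT_SECONDS // 2  # frames on each side of the center


def window(track: Sequence[int], center: int, half: int = HALF) -> List[int]:
    """Return the frames in [center - half, center + half], zero-padded
    (treated as non-speech) where the window runs past the meeting edges."""
    return [track[i] if 0 <= i < len(track) else 0
            for i in range(center - half, center + half + 1)]


def context_features(activity: Sequence[Sequence[int]],
                     target: int,
                     center: int) -> Tuple[List[int], List[int]]:
    """Build speech/non-speech context for one frame of one talker.

    activity[k][t] is 1 if participant k is speaking at frame t, else 0.
    Returns the target talker's window and the window of the other
    participant with the most speech frames inside that same window.
    """
    target_win = window(activity[target], center)
    others = [window(track, center)
              for k, track in enumerate(activity) if k != target]
    if not others:
        # Single-participant case: no interlocutor, so use all non-speech.
        return target_win, [0] * len(target_win)
    # "Locally most talkative": largest speech-frame count in the window.
    other_win = max(others, key=sum)
    return target_win, other_win


# Toy meeting: three participants, 200 frames (20 seconds at the assumed rate).
meeting = [[0] * 200 for _ in range(3)]
for t in range(90, 110):
    meeting[0][t] = 1    # target talker
for t in range(100, 120):
    meeting[1][t] = 1    # overlaps the end of the target's talkspurt
for t in range(0, 10):
    meeting[2][t] = 1    # speaks only at the start of the meeting

target_ctx, other_ctx = context_features(meeting, target=0, center=100)
```

In a classifier, the two windows could be concatenated into a single feature vector per frame; how the original system encodes or models this context is not specified in the abstract.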
