Multimodal meeting analysis by segmentation and classification of meeting events based on a higher level semantic approach

This paper addresses the analysis of meetings by segmenting them into sub-genres, using an approach that operates on a higher semantic level. The algorithms rely on the outputs of specialized recognizers, such as a speaker turn detector and a gesture recognizer. The goal of the investigation was to determine how well meetings can be analyzed when only the outputs of these recognizers are available. After a brief introduction to the basics of these recognizers, two slightly different segmentation methods are presented. The results show the potential of these methods to find segment boundaries and to categorize the detected segments into sub-genres (also called meeting events or group actions). This segmentation can serve as a basis for further analysis, such as topic detection and content extraction.
