On automatic annotation of meeting databases

In this paper, we discuss meetings as an application domain for multimedia content analysis. Meeting databases are a rich data source suitable for a variety of audio, visual, and multi-modal tasks, including speech recognition, people and action recognition, and information retrieval. We focus specifically on the task of semantic annotation of audio-visual (AV) events, where annotation consists of assigning labels (event names) to the data. Developing an automatic annotation system in a principled manner requires a well-defined task, a standard corpus, and an objective performance measure. In this work we address each of these issues for the task of automatically annotating events based on participant interactions.
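
To make the notion of an objective performance measure concrete, one option for scoring a hypothesized sequence of event labels against a reference annotation is an edit-distance-based error rate, analogous to word error rate in speech recognition. The sketch below is a hypothetical illustration of that idea, not the specific metric or event vocabulary used in this work; the example event names are invented placeholders.

```python
# Hypothetical sketch: scoring an automatic event annotation against a
# reference label sequence with an edit-distance-based error rate,
# analogous to word error rate in speech recognition.

def event_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference)."""
    n, m = len(reference), len(hypothesis)
    # d[i][j] = minimum edit distance between reference[:i] and hypothesis[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[n][m] / max(n, 1)

if __name__ == "__main__":
    # Event names here are invented placeholders, not labels from any corpus.
    ref = ["monologue", "discussion", "presentation", "discussion"]
    hyp = ["monologue", "discussion", "discussion"]
    print(f"event error rate: {event_error_rate(ref, hyp):.2f}")  # 0.25
```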
