Recognition and understanding of meetings: the AMI and AMIDA projects

The AMI and AMIDA projects are concerned with the recognition and interpretation of multiparty meetings. Within these projects we have: developed an infrastructure for recording meetings using multiple microphones and cameras; released a 100-hour annotated corpus of meetings; developed techniques for the recognition and interpretation of meetings based primarily on speech recognition and computer vision; and developed an evaluation framework at both component and system levels. In this paper we present an overview of these projects, with an emphasis on speech recognition and content extraction.
