Interpretation of Multiparty Meetings: The AMI and AMIDA Projects

AMI and AMIDA are collaborative EU projects concerned with the automatic recognition and interpretation of multiparty meetings. This paper provides an overview of the advances we have made in these projects, with a particular focus on the multimodal recording infrastructure, the publicly available AMI corpus of annotated meeting recordings, and the speech recognition framework we have developed for this domain.
