Paper information - Meetings about meetings: research at ICSI on speech in multiparty conversations

In early 2001, at the Human Language Technology meeting, we reported on the early stages of a project at ICSI (International Computer Science Institute) on processing speech from meetings, carried out in collaboration with other sites, principally SRI, Columbia, and UW. Here we report our progress from the first few years of this effort, including: the collection and subsequent release of a 75-meeting corpus (over 70 meeting-hours, with up to 16 channels per meeting); the development of a prosodic database for a large subset of these meetings and its subsequent use for punctuation and disfluency detection; the development of a dialog annotation scheme and its implementation for a large subset of the meetings; and the improvement of both near-mic and far-mic speech recognition results on meeting speech test sets.

[1] Andreas Stolcke, et al. The Meeting Project at ICSI, 2001, HLT.

[2] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.

[3] Hynek Hermansky, et al. Qualcomm-ICSI-OGI features for ASR, 2002, INTERSPEECH.

[4] Laura Docío Fernández, et al. Far-field ASR on inexpensive microphones, 2003, INTERSPEECH.

[5] Andreas Stolcke, et al. The ICSI Meeting Corpus, 2003, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[6] Larry P. Heck, et al. Modeling dynamic prosodic variation for speaker verification, 1998, ICSLP.

[7] Nelson Morgan, et al. Double the trouble: handling noise and reverberation in far-field automatic speech recognition, 2002, INTERSPEECH.

[8] Andreas Stolcke, et al. Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues, 2002, INTERSPEECH.

[9] Andreas Stolcke, et al. Multispeaker speech activity detection for the ICSI meeting recorder, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '01).

[10] Gökhan Tür, et al. Prosody-based automatic segmentation of speech into sentences and topics, 2000, Speech Commun.