Paper information - Unsupervised vocabulary selection for real-time speech recognition of lectures

Unsupervised vocabulary selection for real-time speech recognition of lectures

In this work, we propose a novel method for vocabulary selection that automatically adapts automatic speech recognition systems to the diverse topics that occur in educational and scientific lectures. Utilizing materials that are available before the lecture begins, such as lecture slides, our proposed framework iteratively searches for related documents on the web and generates a lecture-specific vocabulary based on the resulting documents. Specifically, we first collect documents similar to an initial seed document and then rank the resulting vocabulary by a score calculated from a combination of word features. This is a critical component for adaptation that has typically been overlooked in prior works. On the interACT German-English simultaneous lecture translation system, our proposed approach significantly improved vocabulary coverage, reducing the out-of-vocabulary rate by 57.0% on average and by up to 84.9% compared to a lecture-independent baseline. Furthermore, our approach reduced the word error rate by 12.5% on average and by up to 25.3% compared to a lecture-independent baseline.
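The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the word features or how they are combined, so everything specific here is an assumption for illustration only: the two features (relative term frequency and document frequency), the equal weights, the simple tokenizer, and the function names `rank_vocabulary`, `select_vocabulary`, and `oov_rate` are all hypothetical. The web search stage is not modeled; the collected documents are passed in as plain strings.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (including German umlauts and eszett)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def rank_vocabulary(documents, w_tf=0.5, w_df=0.5):
    """Rank words by a weighted mix of two illustrative features.

    tf: relative term frequency over all collected documents
    df: fraction of collected documents that contain the word
    The features and weights are assumptions, not the paper's actual choice.
    """
    tokenized = [tokenize(d) for d in documents]
    term_counts = Counter(tok for doc in tokenized for tok in doc)
    doc_counts = Counter(tok for doc in tokenized for tok in set(doc))
    total = sum(term_counts.values())
    scores = {
        w: w_tf * term_counts[w] / total + w_df * doc_counts[w] / len(tokenized)
        for w in term_counts
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))


def select_vocabulary(baseline_vocab, documents, max_new_words):
    """Extend a baseline vocabulary with the top-ranked words it lacks."""
    ranked = rank_vocabulary(documents)
    new_words = [w for w in ranked if w not in baseline_vocab][:max_new_words]
    return set(baseline_vocab) | set(new_words)


def oov_rate(vocab, transcript):
    """Fraction of transcript tokens that are missing from the vocabulary."""
    tokens = tokenize(transcript)
    if not tokens:
        return 0.0
    return sum(t not in vocab for t in tokens) / len(tokens)


# Toy data standing in for a baseline vocabulary, retrieved web documents,
# and a lecture transcript.
baseline = {"the", "of", "is", "a"}
docs = [
    "the spectrogram of a phoneme",
    "a phoneme is a unit",
    "the spectrogram is a picture",
]
transcript = "the phoneme is a unit of the spectrogram"

adapted = select_vocabulary(baseline, docs, max_new_words=2)
print(oov_rate(baseline, transcript))  # 0.375
print(oov_rate(adapted, transcript))   # 0.125
```

In this toy run, adding the two top-ranked new words ("phoneme" and "spectrogram") leaves only "unit" out of vocabulary. A fixed cap on added words is one plausible way to limit vocabulary growth; the abstract does not say how the final vocabulary size is chosen.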

Ian R. Lane | Alexander H. Waibel | Paul Maergner

[1] Tatsuya Kawahara, et al. Automatic lecture transcription by exploiting presentation slide information for language model adaptation, 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] James R. Glass, et al. Recent progress in the MIT spoken lecture processing project, 2007, INTERSPEECH.

[3] Gerald Penn, et al. Web-based language modelling for automatic lecture transcription, 2007, INTERSPEECH.

[4] Jan Niehues, et al. Quaero Speech-to-Text and Text Translation Evaluation Systems, 2010, High Performance Computing in Science and Engineering.

[5] Jan Niehues, et al. Simultaneous German-English lecture translation, 2008, IWSLT.

[6] Paul Maergner, et al. Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture Translation, 2011, MTSUMMIT.

[7] Hiroki Yamazaki, et al. Dynamic language model adaptation using presentation slides for lecture speech recognition, 2007, INTERSPEECH.