Paper information - Flexible transcription alignment

Flexible transcription alignment

This paper presents a set of techniques employed in our Janus Recognition Toolkit (JRTk) Switchboard and CallHome recognizer to deal with imperfections in the transcriptions: inconsistently transcribed pronunciations and contractions, and errors in utterance segmentation. The techniques consist of a dynamic, speaking-mode-dependent pronunciation model and a flexible utterance alignment procedure based on speaker-adapted models (label boosting). The idea is (a) to automatically retranscribe the training corpus using these models and procedures, (b) to train a recognizer on the resulting flexible transcription graphs, and (c) to decode with a dynamic, speaking-mode-dependent dictionary. Applying the framework significantly improves the performance of our state-of-the-art JRTk Switchboard recognizer.
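The retranscription idea in step (a) can be illustrated with a toy sketch. It is not the JRTk implementation: the lexicon entries, the function names `edit_distance` and `retranscribe`, and the use of phone-level edit distance as a stand-in for an acoustic alignment score are all assumptions made for illustration. Each word contributes a slot of alternative pronunciations, and the path through the slots that best matches an observed phone sequence becomes the new transcription.

```python
from itertools import product

# Hypothetical toy lexicon (not from the paper): each word lists
# alternative pronunciations as tuples of phone symbols.
LEXICON = {
    "going": [("G", "OW", "IH", "NG"), ("G", "OW", "IH", "N")],
    "to": [("T", "UW"), ("T", "AX")],
}


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def retranscribe(words, observed, lexicon):
    """Choose one pronunciation per word so the concatenated phones
    are closest to the observed phone sequence.

    Edit distance is an assumed stand-in for an acoustic score; the
    exhaustive search is only practical for toy inputs.
    """
    # One slot of alternatives per word; unknown words keep a single
    # placeholder alternative.
    slots = [lexicon.get(w, [(w,)]) for w in words]

    def cost(path):
        phones = [p for variant in path for p in variant]
        return edit_distance(phones, observed)

    return list(min(product(*slots), key=cost))


observed = ["G", "OW", "IH", "N", "T", "AX"]
chosen = retranscribe(["going", "to"], observed, LEXICON)
print(chosen)
```

A real system would score paths through the graph with acoustic models during Viterbi alignment rather than enumerate every combination; the exhaustive search here only keeps the sketch short.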

Alex Waibel | Michael Finke

[1] Alexander H. Waibel, et al. Recognition of conversational telephone speech using the JANUS speech engine, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2] Alex Waibel, et al. Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode, 1999.

[3] Alexander H. Waibel, et al. Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition, 1997, EUROSPEECH.

[4] Daniel Jurafsky, et al. Building multiple pronunciation models for novel words using exploratory computational phonology, 1995, EUROSPEECH.