Transcription of broadcast news

In this paper we report on our recent work in transcribing broadcast news shows. Radio and television broadcasts contain signal segments of varied linguistic and acoustic natures. The shows contain both prepared and spontaneous speech. The signal may be of studio quality, may have been transmitted over a telephone or other noisy channel (i.e., corrupted by additive noise and nonlinear distortions), or may contain speech over music. Transcribing this type of data is challenging because the recognizer must handle a continuous stream of data under varying conditions. Our approach to this problem is to segment the data into a set of categories, which are then processed with category-specific acoustic models. We describe our 65k-word speech recognizer and experiments using different sets of acoustic models for transcription of broadcast news data. Prior knowledge of the segment boundaries and types is shown not to be crucial to recognition performance.
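The segment-then-dispatch idea described above can be illustrated with a minimal sketch. It is not the system described in the paper: the category labels, the `Segment` type, and the stand-in decoder callables are all hypothetical, chosen only to show how each labeled segment could be routed to a model matched to its acoustic condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    start: float   # segment start time in seconds
    end: float     # segment end time in seconds
    category: str  # hypothetical label, e.g. "studio", "telephone", "music"


# A decoder is any callable mapping a segment to a transcript string.
# In a real system it would wrap a recognizer using category-specific
# acoustic models; here it is only a placeholder (assumption).
Decoder = Callable[[Segment], str]


def transcribe(segments: List[Segment],
               decoders: Dict[str, Decoder],
               fallback: Decoder) -> List[str]:
    """Route each segment to the decoder registered for its category.

    Segments whose category has no registered decoder go to `fallback`.
    """
    return [decoders.get(seg.category, fallback)(seg) for seg in segments]


def make_stub(name: str) -> Decoder:
    """Build a placeholder decoder that reports which model handled the segment."""
    return lambda seg: f"{name}:{seg.start:.1f}-{seg.end:.1f}"
```

A call such as `transcribe(segments, {"studio": make_stub("wideband"), "telephone": make_stub("narrowband")}, make_stub("generic"))` would send studio and telephone segments to their own models and everything else to the generic one. Keeping the fallback explicit reflects that a segment classifier may emit labels for which no dedicated model was trained.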
