Speech Recognition Issues for Dutch Spoken Document Retrieval

This paper describes ongoing work on the speech recognition modules of a multimedia information retrieval (MMIR) environment for Dutch. It presents the generation of the acoustic and language models along with their current performance, and discusses the characteristics of the Dutch language and of the target video archives that require special treatment.
