Paper information - Reconnaissance de la parole guidée par des transcriptions approchées

In many cases, an approximate transcript can be associated with a speech signal: movie subtitles, screenplays and theatre scripts, summaries, and radio broadcasts. These transcripts rarely correspond to the exact words uttered. The goal of this work is to use this information to improve the performance of an automatic speech recognition (ASR) system by integrating information derived from the transcripts. In this paper, we use the partial transcript both to adapt the language model and to rescore the ASR word hypotheses when the partial transcript matches the input signal. Several applications are possible: helping deaf people follow a play with closed captions aligned to the voice signal (accounting for performer variations), watching a movie in another language using aligned closed captions, and transcribing debates or meetings in real time.
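The rescoring idea in the abstract can be illustrated with a small sketch. This is not the authors' method or the Speeral decoder: it assumes a hypothetical n-best list of `(text, asr_log_score)` pairs, and the function names, the `weight` parameter, and the use of Python's `difflib.SequenceMatcher` for word alignment are all illustrative choices. Each hypothesis simply receives a bonus proportional to the fraction of its words that align, in order, with the approximate transcript.

```python
from difflib import SequenceMatcher


def transcript_match_ratio(hypothesis, transcript):
    """Fraction of hypothesis words that align, in order, with the transcript."""
    hyp_words = hypothesis.lower().split()
    ref_words = transcript.lower().split()
    if not hyp_words:
        return 0.0
    # autojunk=False disables difflib's popularity heuristic for long sequences.
    matcher = SequenceMatcher(a=hyp_words, b=ref_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(hyp_words)


def rescore(nbest, transcript, weight=2.0):
    """Add a transcript-match bonus to each ASR score and sort best-first.

    nbest: list of (hypothesis_text, asr_log_score) pairs (hypothetical format).
    weight: illustrative scaling of the bonus, not a value from the paper.
    """
    rescored = [
        (text, score + weight * transcript_match_ratio(text, transcript))
        for text, score in nbest
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up example: the subtitle differs from what was actually said.
    nbest = [
        ("the cat sat on the hat", -10.0),
        ("the cat sat on the mat", -10.2),
    ]
    for text, score in rescore(nbest, "a cat sat on the mat"):
        print(f"{score:.3f}  {text}")
```

In this sketch the bonus is computed over the whole hypothesis. A real system would apply it only on segments where the transcript actually matches the signal, as the abstract describes.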

Georges Linarès | Benjamin Lecouteux | Jean-François Bonastre | Pascal Nocera
