Integrating Speech Recognition and Machine Translation

This paper presents a set of experiments conducted to optimize the performance of an Arabic/English machine translation system on broadcast news and conversational speech data. Proper integration of speech-to-text (STT) and machine translation (MT) requires special attention to issues such as sentence boundary detection, punctuation, STT accuracy, tokenization, conversion of spoken numbers and dates to written form, optimization of MT decoding weights, and scoring. We discuss these issues and show that a carefully tuned STT/MT integration yields significant improvements in translation accuracy over simply feeding regular STT output to a text-based MT system.
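To make the spoken-to-written normalization step concrete, the sketch below shows one simple way an STT hypothesis might have spoken number words rewritten as digits before being passed to a text MT system. The word list and rules are illustrative assumptions, not the rules used in this work; a real front end would also cover Arabic number words, compound numbers, ordinals, and full date expressions.

```python
# Minimal sketch (assumed, not the paper's actual normalizer): rewrite
# isolated spoken number words in an STT hypothesis as digit strings.
_WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10", "twenty": "20", "thirty": "30", "hundred": "100",
}


def normalize_spoken_numbers(hypothesis: str) -> str:
    """Replace spoken number words with digits, leaving other tokens unchanged."""
    tokens = hypothesis.split()
    return " ".join(_WORD_TO_DIGIT.get(tok.lower(), tok) for tok in tokens)


if __name__ == "__main__":
    print(normalize_spoken_numbers("the summit opens on may five at ten"))
    # -> "the summit opens on may 5 at 10"
```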