A hybrid phrase-based/statistical speech translation system

Spoken communication across a language barrier is of increasing importance in both civilian and military applications. In this paper, we present a system for task-directed two-way communication between speakers of English and Iraqi colloquial Arabic. The application domain of the system is force protection. The system supports translingual dialogue in areas that include municipal services surveys, detainee screening, and descriptions of people, houses, vehicles, etc. N-gram speech recognition is used to recognize both English and Arabic speech. To translate the recognition output, the system combines pre-recorded questions with statistical machine translation and speech synthesis.
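The abstract does not say how the system chooses between a pre-recorded question and statistical machine translation. The sketch below is one plausible reading, not the authors' method: it assumes an exact-match lookup (after simple normalization) against a table of pre-recorded questions, with a fallback to a translation function whose output would then be passed to speech synthesis. The names `translate_utterance`, `Output`, `normalize`, the audio file name, and the `smt_translate` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Output:
    source: str   # "prerecorded" or "smt+tts" (labels assumed for illustration)
    payload: str  # audio file name, or translated text to be synthesized


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical recognitions match."""
    return " ".join(text.lower().split())


def translate_utterance(
    recognized: str,
    prerecorded: Dict[str, str],
    smt_translate: Callable[[str], str],
) -> Output:
    """Route recognizer output to a recording if one exists, else to SMT.

    Assumption: the keys of `prerecorded` are already normalized questions.
    """
    key = normalize(recognized)
    if key in prerecorded:
        # A human-recorded translation exists for this question: play it.
        return Output("prerecorded", prerecorded[key])
    # Otherwise translate statistically; the text would go to a synthesizer.
    return Output("smt+tts", smt_translate(recognized))


if __name__ == "__main__":
    table = {"what is your name": "q_name.wav"}  # hypothetical recording
    stub_smt = lambda s: "<translated> " + s      # stand-in for a real SMT engine
    print(translate_utterance("What is  your NAME", table, stub_smt))
    print(translate_utterance("my car is blue", table, stub_smt))
```

A real system would more likely match recognizer hypotheses to the question set with some tolerance for recognition errors; exact matching is used here only to keep the routing idea visible.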
