Enriching spoken language translation with dialog acts

Current statistical speech translation approaches predominantly rely on text transcripts alone and do not adequately utilize rich contextual information such as that conveyed through prosody and discourse function. In this paper, we explore the role of context, characterized through dialog acts (DAs), in statistical translation. We demonstrate the integration of dialog acts in a phrase-based statistical translation framework, employing three limited-domain parallel corpora (Farsi-English, Japanese-English, and Chinese-English). For all three language pairs, in addition to producing interpretable DA-enriched target language translations, we also obtain improvements in objective evaluation metrics such as lexical selection accuracy and BLEU score.
