INTEGRATING TEXT AND PHONETIC INFORMATION FOR ROBUST STATISTICAL SPEECH TRANSLATION

This paper investigates the use of both text and phonetic information in a speech translation system, with the aim of making translation results more robust to speech recognition errors. Conventional statistical speech translation formulas are extended to exploit both text-form and phonetic speech recognition results. A novel data-driven word tying algorithm is then proposed that groups words based on both pronunciation similarity and meaning equivalence. In our speech-to-text translation experiments, using phonetic information together with the proposed word tying algorithm yielded a significant improvement.
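
The abstract does not spell out how the word tying is computed, so the following is only an illustrative sketch of the general idea, not the algorithm proposed in the paper. It assumes that pronunciation similarity can be approximated by a normalized phoneme edit distance and that meaning equivalence can be approximated by the overlap between the sets of target-language translations two words align to. The function names, the thresholds min_phon and min_mean, the phoneme strings, and the translation sets are all hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def phonetic_similarity(p, q):
    """1.0 for identical pronunciations, 0.0 for entirely different ones."""
    longest = max(len(p), len(q))
    return 1.0 - edit_distance(p, q) / longest if longest else 1.0


def meaning_overlap(s, t):
    """Jaccard overlap of two translation sets (assumed proxy for meaning)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def tie_words(pron, trans, min_phon=0.6, min_mean=0.5):
    """Group words that are close in BOTH pronunciation and meaning."""
    parent = {w: w for w in pron}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for u, v in combinations(sorted(pron), 2):
        similar_sound = phonetic_similarity(pron[u], pron[v]) >= min_phon
        similar_meaning = meaning_overlap(trans.get(u, set()),
                                          trans.get(v, set())) >= min_mean
        if similar_sound and similar_meaning:
            parent[find(u)] = find(v)

    groups = {}
    for w in pron:
        groups.setdefault(find(w), set()).add(w)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy lexicon: phoneme sequences and aligned translations.
PRON = {
    "want": ["W", "AA", "N", "T"],
    "wants": ["W", "AA", "N", "T", "S"],
    "wanted": ["W", "AA", "N", "T", "IH", "D"],
    "won't": ["W", "OW", "N", "T"],
    "ticket": ["T", "IH", "K", "IH", "T"],
}
TRANS = {
    "want": {"vouloir"},
    "wants": {"vouloir"},
    "wanted": {"vouloir"},
    "won't": {"ne_pas"},
    "ticket": {"billet"},
}

groups = tie_words(PRON, TRANS)
print(groups)
```

In this toy lexicon, "want", "wants", and "wanted" fall into one group, while "won't" stays separate: it sounds similar to "want" but shares no translation with it. Requiring both criteria is what keeps acoustically confusable words with different meanings from being merged.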
