Rapid development of an English/Farsi speech-to-speech translation system.

Speech-to-Speech (S2S) translation systems have advanced significantly in recent years. However, rapidly configuring S2S systems for low-resource language pairs and domains remains challenging because human-translated bilingual training data is scarce. In this paper, we report on an effort to port our existing English/Iraqi S2S system to the English/Farsi language pair in just 90 days, using only a small amount of training data. This effort included developing acoustic models for Farsi, domain-relevant language models for English and Farsi, and translation models for English-to-Farsi and Farsi-to-English. As part of this work, we developed two novel techniques for expanding the training data: the reuse of data from different language pairs and the directed collection of new data. In an independent evaluation, the resulting system achieved the highest performance of all systems evaluated.
