Learning a Translation Model from Word Lattices

Translation models have been used to improve automatic speech recognition when speech input is paired with a written translation, primarily for computer-aided translation. Existing approaches require large amounts of parallel text to train the translation models, but for many language pairs such data is not available. We propose a model that learns lexical translation parameters directly from the word lattices for which a transcription is sought. The model is expressed as the composition of each lattice with a weighted finite-state transducer representing the translation model, and inference is performed by sampling paths through the composed finite-state transducer. We show consistent word error rate reductions on two datasets, using between 20 minutes and 4 hours of speech input, and the model additionally outperforms a translation model trained on the 1-best path.
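
To make the idea of composing a lattice with a translation model and then sampling paths more concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the model described in the abstract: the lattice is a small hand-written acyclic graph, the translation model is a fixed table of lexical probabilities scored in an IBM Model 1 style (averaging over the words of the written translation), and "composition" is reduced to reweighting each lattice arc. All names (`rescore`, `backward`, `sample_path`, `t_prob`) and all numbers are hypothetical.

```python
import random

def rescore(lattice, translation, t_prob, floor=1e-6):
    """Reweight each lattice arc by a Model-1-style lexical translation score.

    lattice: dict mapping state -> list of (next_state, word, probability)
    translation: list of words in the paired written translation
    t_prob: dict mapping (lattice_word, translation_word) -> probability
    floor: assumed small probability for unseen word pairs
    """
    rescored = {}
    for state, arcs in lattice.items():
        new_arcs = []
        for nxt, word, p in arcs:
            # Average translation probability over all translation words.
            tm = sum(t_prob.get((word, e), floor) for e in translation) / len(translation)
            new_arcs.append((nxt, word, p * tm))
        rescored[state] = new_arcs
    return rescored

def backward(lattice, final):
    """Total weight of all paths from each state to the final state."""
    beta = {final: 1.0}

    def visit(state):
        if state not in beta:
            beta[state] = sum(w * visit(n) for n, _, w in lattice.get(state, []))
        return beta[state]

    for state in list(lattice):
        visit(state)
    return beta

def sample_path(lattice, start, final, rng):
    """Sample one path with probability proportional to its total weight."""
    beta = backward(lattice, final)
    state, words = start, []
    while state != final:
        arcs = lattice[state]
        # Weight each arc by its own score times the mass of its continuations.
        weights = [w * beta[n] for n, _, w in arcs]
        nxt, word, _ = rng.choices(arcs, weights=weights, k=1)[0]
        words.append(word)
        state = nxt
    return words

# Toy data (entirely made up): two competing hypotheses per position.
lattice = {
    0: [(1, "the", 0.6), (1, "a", 0.4)],
    1: [(2, "house", 0.5), (2, "mouse", 0.5)],
}
translation = ["das", "haus"]
t_prob = {
    ("the", "das"): 0.9, ("a", "das"): 0.05,
    ("house", "haus"): 0.9, ("mouse", "haus"): 0.05,
}

rescored = rescore(lattice, translation, t_prob)
print(sample_path(rescored, 0, 2, random.Random(0)))
```

In this toy setup the acoustic scores alone cannot separate "house" from "mouse", but the translation table shifts most of the sampling mass toward "house". A learning procedure in the spirit of the abstract would re-estimate the translation table from such sampled paths rather than keep it fixed as this sketch does.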
