The IRST English-Spanish Translation System for European Parliament Speeches

This paper presents the spoken language translation system developed at FBK-irst during the TC-STAR project. The system integrates automatic speech recognition with machine translation through confusion networks, which can represent a very large number of transcription hypotheses generated by the speech recognizer. A statistical machine translation system decodes the confusion networks efficiently and computes the most probable translation in the target language. The paper describes the complete architecture developed for translating political speeches held at the European Parliament, from English to Spanish and vice versa, and at the Spanish Parliament, from Spanish to English.
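
To make the notion of a confusion network concrete, the following is a minimal Python sketch, not the FBK-irst implementation. It assumes a toy network in which each slot maps alternative words to posterior probabilities and the empty string stands for the empty word. The words, the probabilities, and the names `cn`, `num_paths`, and `consensus` are all hypothetical. The sketch only shows why a short network already encodes many transcription hypotheses, and how the single best transcription would be read off. The system described above instead passes the whole network to the translation decoder.

```python
from math import prod

# Hypothetical toy confusion network: one dict per slot, mapping each
# alternative word to its posterior probability. "" is the empty word.
cn = [
    {"the": 0.9, "a": 0.1},
    {"house": 0.6, "houses": 0.3, "": 0.1},
    {"votes": 0.55, "voted": 0.45},
]


def num_paths(network):
    """Count the transcription hypotheses (one word choice per slot)."""
    return prod(len(slot) for slot in network)


def consensus(network):
    """Pick the highest-posterior word in each slot, dropping empty words."""
    best = [max(slot, key=slot.get) for slot in network]
    return [word for word in best if word]


print(num_paths(cn))   # number of distinct paths through the network
print(consensus(cn))   # single best transcription
```

The number of paths grows as the product of the slot sizes, which is why a network of modest length can stand in for a very large set of recognizer hypotheses.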
