The ITC-irst SMT system for IWSLT 2006

This paper reports on the participation of ITC-irst in the evaluation campaign of the International Workshop on Spoken Language Translation 2006. Our two-pass system is an evolution of the one we employed for the 2005 campaign: in the first pass, a beam-search decoder generates an N-best list of translations for each source sentence; in the second pass, the N-best lists are rescored and reranked using additional feature functions. The main updates to the 2005 system are novel additional features, which are described here. Results on the development sets are analyzed and discussed.
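The second pass can be illustrated with a minimal sketch. It assumes that each N-best entry carries a single first-pass score and that rescoring is a weighted sum of that score and the extra feature values. The function names, the toy length feature, the weights, and the example sentences are all hypothetical and are not taken from the ITC-irst system.

```python
from typing import Callable, Dict, List, Tuple

# A feature function maps (source, translation) to a real-valued score.
FeatureFn = Callable[[str, str], float]


def rerank(
    source: str,
    nbest: List[Tuple[str, float]],
    features: Dict[str, FeatureFn],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rescore each (translation, first_pass_score) pair and sort best first."""
    rescored = []
    for translation, first_pass_score in nbest:
        # Start from the weighted first-pass score (weight defaults to 1.0).
        total = weights.get("first_pass", 1.0) * first_pass_score
        # Add the weighted contribution of every additional feature.
        for name, feature in features.items():
            total += weights[name] * feature(source, translation)
        rescored.append((translation, total))
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored


def length_difference(source: str, translation: str) -> float:
    """Toy feature (assumption): penalize differences in word count."""
    return -float(abs(len(source.split()) - len(translation.split())))


# Hypothetical first-pass output: higher scores are better.
nbest = [
    ("a the small house there", -0.9),
    ("the small house", -1.0),
    ("the house small", -1.2),
]
reranked = rerank(
    "la casa piccola",
    nbest,
    features={"length": length_difference},
    weights={"first_pass": 1.0, "length": 0.5},
)
print(reranked[0][0])
```

In this toy case the hypothesis ranked first by the first pass is demoted because the extra feature penalizes its length. A real system would tune the weights on a development set rather than fix them by hand.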
