Reordering on Spanish-Basque SMT

In this work we deal with the reordering problem in Spanish-Basque statistical machine translation, comparing three different approaches and analyzing their strengths and weaknesses. The tested approaches cover the most common techniques: lexicalized reordering as implemented in Moses, preprocessing based on hand-defined rules over the syntactic analysis of the source, and reordering of the source by statistical translation. According to the results obtained, all three reordering techniques improve on the baseline. The techniques behave differently when combined: using the syntax-based reordered corpus together with lexicalized reordering gives the best results, whereas training the lexicalized reordering on the statistically reordered source does not improve on the performance of either method alone.
