Paper Information - A Purely Monotonic Approach to Machine Translation for Similar Languages

This paper investigates the effect of taking a strictly monotonic approach to machine translation for a restricted set of suitable language pairs. We studied the effect of decoding monotonically for a set of language pairs that have similar word order characteristics and found that for some language pairs, namely those in which both languages have SOV word order, there was almost no difference in machine translation quality. The results of this experiment motivated extending the monotonic approach to the alignment stage of training. We used a Bayesian non-parametric aligner that has been shown to outperform GIZA++ in combination with the grow-diag-final-and heuristic on transliteration data. Our results show that the monotonic aligner was able to match the performance of the GIZA++ baseline, and gains in translation performance were obtained by integrating both aligners into the systems.

Eiichiro Sumita | Yoshinori Sagisaka | Andrew M. Finch | Ye Kyaw Thu
