Unsupervised training of maximum-entropy models for lexical selection in rule-based machine translation

This article presents a method for training maximum-entropy models to perform lexical selection in a rule-based machine translation system. The training method is unsupervised; that is, it does not require an annotated corpus. It uses source-language monolingual corpora, the machine translation (MT) system in which the models are integrated, and a statistical target-language model. Using the MT system, the sentences in the source-language corpus are translated in all possible ways according to the different translation equivalents in the system's bilingual dictionary. These translations are then scored with the target-language model, and the scores are normalised to provide fractional counts for training source-language maximum-entropy lexical-selection models. We show that these models perform as well as, or better than, using the target-language model directly for lexical selection, at a substantially reduced computational cost.
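The following is a minimal sketch of the fractional-count step described above, assuming hypothetical helpers `generate_all_translations` (enumerating every translation the bilingual dictionary allows, together with the lexical choices made) and `lm_log_prob` (the target-language model score); neither name is from the original system, and the softmax-style normalisation is one plausible way to turn log-probabilities into weights that sum to one per sentence.

```python
# Sketch: turn target-language-model scores over all possible translations
# of each source sentence into fractional counts for max-ent training.
# `generate_all_translations` and `lm_log_prob` are hypothetical stand-ins
# for the MT system and the statistical target-language model.
import math
from collections import defaultdict

def generate_all_translations(sentence):
    """Hypothetical: yield (translation, lexical_choices) pairs, where
    lexical_choices maps each ambiguous source word to the translation
    equivalent used in that candidate."""
    raise NotImplementedError

def lm_log_prob(translation):
    """Hypothetical: log-probability of the translation under the
    target-language model."""
    raise NotImplementedError

def fractional_counts(corpus):
    """Accumulate fractional counts for (source word, chosen equivalent)
    events, to be used as training data for the maximum-entropy
    lexical-selection models."""
    counts = defaultdict(float)
    for sentence in corpus:
        candidates = list(generate_all_translations(sentence))
        if not candidates:
            continue
        # Score every candidate with the target-language model and
        # normalise the scores so they sum to one over the candidates.
        log_scores = [lm_log_prob(t) for t, _ in candidates]
        max_log = max(log_scores)
        weights = [math.exp(s - max_log) for s in log_scores]
        total = sum(weights)
        for (_, choices), w in zip(candidates, weights):
            for source_word, equivalent in choices.items():
                counts[(source_word, equivalent)] += w / total
    return counts
```

The normalised weights play the role of fractional counts: each candidate translation contributes to the counts of the lexical choices it made in proportion to how well the target-language model scores it, so the target-language model only needs to be consulted at training time rather than at translation time.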
