This paper presents a method that can improve the translation quality of a phrase-based statistical machine translation system without the need for additional training data. The technique exploits the asymmetry of the phrase-table generation process during training. In our experiments we use the GIZA++ toolkit for alignment, and the phrase extraction utilities that are provided with the MOSES decoder. These tools are commonly used in the field, and serve as the benchmark by which other techniques are measured. Our experiments show that if the word order of the corpus (both source and target) is reversed during the word alignment/phrase extraction phase of training, the resulting phrase table differs significantly from the one generated from the un-reordered corpus. Typically only about 30-60% of the phrase pairs are shared between the forward- and reverse-generated phrase tables. Our approach exploits this asymmetry by integrating the two phrase tables into a single larger table, and using this integrated phrase table for decoding. The phrase-table integration is done by linear interpolation. The benefits of this approach are two-fold. Firstly, the larger number of phrases present in the integrated phrase table allows for greater coverage of the test data. Secondly, phrases that occur in both tables receive contributions to their probability mass from both entries during the interpolation process. This effectively boosts the probability of the more reliable phrases that occur in both tables relative to less reliable phrases that occur in only one of the tables. To evaluate our approach we ran a total of 272 experiments, covering all language pairings from a set of 17 languages, and evaluated using a set of seven machine translation evaluation metrics. Our training data consisted of approximately 160,000 sentence pairs from the ATR BTEC1 corpus. The test set was 5000 single-reference sentences drawn from the same sample.
In over 95% of our experiments we show consistent gains over baseline systems trained in the usual manner on unreversed training data.
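The integration step described above can be illustrated with a minimal sketch. It is not the implementation used in the experiments, and it rests on several assumptions that the abstract does not state: each phrase pair carries a single probability (a real MOSES phrase table holds several scores per entry), phrases extracted from the reversed corpus are restored to their original word order before merging, and the interpolation weight is 0.5. The names `reverse_words`, `restore_phrase_pair` and `interpolate` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical simplified phrase table: (source phrase, target phrase) -> probability.
PhrasePair = Tuple[str, str]
PhraseTable = Dict[PhrasePair, float]


def reverse_words(sentence: str) -> str:
    """Reverse the word order of a whitespace-tokenised sentence or phrase."""
    return " ".join(reversed(sentence.split()))


def restore_phrase_pair(pair: PhrasePair) -> PhrasePair:
    """Map a phrase pair extracted from the reversed corpus back to normal word order.

    Assumption: both sides were reversed at training time, so both are un-reversed here.
    """
    source, target = pair
    return (reverse_words(source), reverse_words(target))


def interpolate(forward: PhraseTable, reverse: PhraseTable, weight: float = 0.5) -> PhraseTable:
    """Linearly interpolate two phrase tables over the union of their phrase pairs.

    A pair missing from one table contributes 0.0 from that table, so pairs found
    in both tables keep more probability mass than pairs found in only one.
    The default weight of 0.5 is an assumption, not a value from the paper.
    """
    merged: PhraseTable = {}
    for pair in forward.keys() | reverse.keys():
        merged[pair] = (weight * forward.get(pair, 0.0)
                        + (1.0 - weight) * reverse.get(pair, 0.0))
    return merged


# Toy tables (invented values, for illustration only).
forward_table: PhraseTable = {("a b", "x y"): 0.6, ("c", "z"): 0.4}
reversed_corpus_table: PhraseTable = {("b a", "y x"): 0.8, ("d", "w"): 0.2}

# Restore normal word order, then merge.
reverse_table = {restore_phrase_pair(p): prob for p, prob in reversed_corpus_table.items()}
integrated = interpolate(forward_table, reverse_table)
```

In this toy example the integrated table contains three phrase pairs rather than two, and the pair present in both tables ends up with a higher score than either pair that appears in only one, which mirrors the two benefits named in the text.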