Length-Incremental Phrase Training for SMT

We present an iterative technique for generating phrase tables for SMT, based on force-aligning the training data with a modified translation decoder. Unlike previous work, we completely avoid the use of word alignments and phrase extraction heuristics, moving towards more principled phrase generation and probability estimation. During training, we allow the decoder to generate new phrases on the fly and increment the maximum phrase length in each iteration. Experiments are carried out on the IWSLT 2011 Arabic-English task, where our training method achieves moderate improvements over a state-of-the-art baseline. The resulting phrase table shows only a small overlap with the heuristically extracted one, which demonstrates how restrictive it is to limit phrase selection by a word alignment or heuristics. By interpolating the heuristic and the trained phrase table, we improve over the baseline by 0.5% BLEU and 0.5% TER.
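The interpolation of the heuristic and the trained phrase table mentioned above could be sketched roughly as follows. The abstract does not state which interpolation scheme or weight was used, so this minimal sketch assumes a plain linear interpolation of phrase translation probabilities; the function name `interpolate_phrase_tables`, the weight `lam`, and the dictionary representation of a phrase table are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical representation: (source phrase, target phrase) -> probability.
PhraseTable = Dict[Tuple[str, str], float]


def interpolate_phrase_tables(
    heuristic: PhraseTable,
    trained: PhraseTable,
    lam: float = 0.5,
) -> PhraseTable:
    """Linearly interpolate two phrase tables (assumed scheme, not the paper's).

    A phrase pair missing from one table contributes probability 0.0 from
    that table, so pairs found in only one table are kept with reduced mass.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    all_pairs = set(heuristic) | set(trained)
    return {
        pair: lam * heuristic.get(pair, 0.0) + (1.0 - lam) * trained.get(pair, 0.0)
        for pair in all_pairs
    }


if __name__ == "__main__":
    # Toy tables with a small overlap, for illustration only.
    heuristic_table = {("a", "x"): 0.8, ("a", "y"): 0.2}
    trained_table = {("a", "x"): 0.4, ("a", "z"): 0.6}
    for pair, prob in sorted(interpolate_phrase_tables(heuristic_table, trained_table).items()):
        print(pair, round(prob, 3))
```

Taking the union of both tables matters here because the abstract reports only a small overlap between them: an intersection would discard most phrase pairs from either source.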
