Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM

Generative word alignment models, such as the IBM Models, are restricted to one-to-many alignment and cannot explicitly represent many-to-many relationships in a bilingual text. The problem is partially solved either by introducing heuristics or by imposing agreement constraints so that the two directional word alignments agree with each other. In this paper, we focus on the posterior regularization framework (Ganchev et al., 2010), which can force two directional word alignment models to agree with each other during training, and propose new constraints that take into account the difference between function words and content words. Experimental results on French-to-English and Japanese-to-English alignment tasks show statistically significant gains over the previous posterior regularization baseline. We also observed gains in Japanese-to-English translation tasks, which demonstrates the effectiveness of our methods on grammatically different language pairs.
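As a rough illustration of the heuristic route mentioned above, the following sketch combines two directional alignments by intersection and union. It is a generic toy example, not the method proposed in this paper; the function name `symmetrize` and the link sets are hypothetical and chosen only for illustration.

```python
# Toy sketch of heuristic symmetrization (an assumption for illustration,
# not the constraints proposed in the paper).
# Each alignment is a set of (source_index, target_index) links.

def symmetrize(forward, backward):
    """Return the intersection and union of two directional alignments."""
    intersection = forward & backward  # links both directions agree on
    union = forward | backward         # links proposed by either direction
    return intersection, union

# Forward model: every target word has one source, so a source word may
# cover several target words (source 1 links to targets 1 and 2).
forward = {(0, 0), (1, 1), (1, 2)}

# Backward model: every source word has one target, so a target word may
# cover several source words (target 1 links to sources 1 and 2).
backward = {(0, 0), (1, 1), (2, 1)}

intersection, union = symmetrize(forward, backward)
print(sorted(intersection))
print(sorted(union))
```

Neither directional alignment alone contains the many-to-many group formed by source words 1 and 2 and target words 1 and 2; it only appears in the union, while the intersection keeps just the links on which both directions agree.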

[1]  John DeNero, et al. A Constrained Viterbi Relaxation for Bidirectional Word Alignment, 2014, ACL.

[2]  Eiichiro Sumita, et al. Overview of the Patent Machine Translation Task at the NTCIR-10 Workshop, 2011, NTCIR.

[3]  Sadao Kurohashi, et al. Alignment by Bilingual Generation and Monolingual Derivation, 2012, COLING.

[4]  George F. Foster, et al. Batch Tuning Strategies for Statistical Machine Translation, 2012, NAACL.

[5]  Arianna Bisazza, et al. Cutting the Long Tail: Hybrid Language Models for Translation Style Adaptation, 2012, EACL.

[6]  Mark Hopkins, et al. Tuning as Ranking, 2011, EMNLP.

[7]  John DeNero, et al. Model-Based Aligner Combination Using Dual Decomposition, 2011, ACL.

[8]  Taro Watanabe, et al. An Unsupervised Model for Joint Phrase Alignment and Extraction, 2011, ACL.

[9]  Graham Neubig, et al. Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis, 2011, ACL.

[10]  Ben Taskar, et al. Posterior Regularization for Structured Latent Variable Models, 2010, J. Mach. Learn. Res.

[11]  John DeNero, et al. Sampling Alignment Structure under a Bayesian Translation Model, 2008, EMNLP.

[12]  Ben Taskar, et al. Better Alignments = Better Translations?, 2008, ACL.

[13]  Alexander M. Fraser, et al. Squibs and Discussions: Measuring Word Alignment Quality for Statistical Machine Translation, 2007, CL.

[14]  Haizhou Li, et al. Ordering Phrases with Function Words, 2007, ACL.

[15]  Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.

[16]  Ben Taskar, et al. Alignment by Agreement, 2006, NAACL.

[17]  Hermann Ney, et al. Symmetric Word Alignments for Statistical Machine Translation, 2004, COLING.

[18]  Philipp Koehn, et al. Statistical Significance Tests for Machine Translation Evaluation, 2004, EMNLP.

[19]  Daniel Marcu, et al. Statistical Phrase-Based Translation, 2003, NAACL.

[20]  Dan Klein, et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, 2003, NAACL.

[21]  Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[22]  Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[23]  Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[24]  Danqi Chen, et al. of the Association for Computational Linguistics: , 2001.

[25]  Hermann Ney, et al. HMM-Based Word Alignment in Statistical Translation, 1996, COLING.

[26]  Robert L. Mercer, et al. The Mathematics of Statistical Machine Translation: Parameter Estimation, 1993, CL.