A Feature-rich Supervised Word Alignment Model for Phrase-based Statistical Machine Translation

Word alignment plays an important role in statistical machine translation (SMT) systems. The output of word alignment is used to build a phrase table, which is the core model used when decoding new sentences. Most current SMT systems use GIZA++, a generative model, to automatically align words in sentence-aligned parallel corpora. GIZA++ works well when large sentence-aligned corpora are available. However, generative models make it difficult to encode syntactic and lexical features, such as POS tags, affixes, and lemmas, that help with sparse data and unseen words. A discriminative model such as a conditional random field (CRF) can address this problem. We treat word alignment as a labelling problem and encode syntactic, lexical, and contextual features. Our experiments were conducted on a 35K Chinese-English hand-aligned corpus. Our model improves on the word alignment results of GIZA++ by 7% AER. Finally, we also show that a 2% higher BLEU score can be obtained with phrase-based SMT systems when our alignment models are used.
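For readers unfamiliar with the alignment error rate (AER) reported above, the following is a minimal sketch of the standard definition, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the predicted link set, S the sure gold links, and P the possible gold links (with S contained in P). The function name, the link representation as (source index, target index) pairs, and the toy data are assumptions made for illustration; the abstract does not say whether the hand-aligned corpus distinguishes sure from possible links.

```python
from typing import Set, Tuple

# A link pairs a source-word index with a target-word index (hypothetical representation).
Link = Tuple[int, int]


def alignment_error_rate(predicted: Set[Link], sure: Set[Link], possible: Set[Link]) -> float:
    """Return the AER of a predicted alignment against gold sure/possible links."""
    # By convention every sure link is also a possible link.
    possible = possible | sure
    denominator = len(predicted) + len(sure)
    if denominator == 0:
        # Nothing predicted and nothing expected: treat as no error.
        return 0.0
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / denominator


# Toy example: three predicted links, two of which are sure gold links.
predicted = {(0, 0), (1, 1), (2, 3)}
sure = {(0, 0), (1, 1)}
possible = {(2, 2)}
aer = alignment_error_rate(predicted, sure, possible)
```

Lower AER is better. If a gold standard contains only sure links, P equals S and AER reduces to one minus the balanced F-measure of the predicted links.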
