Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules

This paper presents a novel filtering criterion that restricts rule extraction for the hierarchical phrase-based translation model: a bilingual but relaxed well-formed dependency restriction is used to filter out bad rules. It also proposes a new feature that captures the regularity with which a source (or target) dependency edge triggers a target (or source) word. Experimental results show that the new criterion weeds out about 40% of the rules while improving translation performance, and that the new feature brings a further improvement over the baseline system, especially on larger corpora.
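To make the idea of a bilingual dependency-based filter concrete, the following is a minimal sketch under stated assumptions, not the paper's exact criterion. It assumes each sentence comes with a head array (`heads[i]` is the index of the head of word `i`, with `-1` for the root) and treats a span as acceptable when all of its words that attach outside the span attach to a single external head. The function names `is_well_formed` and `keep_rule`, and this particular simplification of well-formedness, are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def is_well_formed(heads: List[int], start: int, end: int) -> bool:
    """Simplified (assumed) well-formedness test for the span [start, end).

    heads[i] is the index of the head of word i, or -1 for the root.
    The span passes when every word whose head lies outside the span
    points to the same external head (the root counts as external).
    """
    external_heads = {
        heads[i] for i in range(start, end) if not (start <= heads[i] < end)
    }
    return len(external_heads) == 1


def keep_rule(
    src_heads: List[int],
    src_span: Tuple[int, int],
    tgt_heads: List[int],
    tgt_span: Tuple[int, int],
) -> bool:
    """Bilingual filter: keep a rule only if both sides pass the test."""
    return is_well_formed(src_heads, *src_span) and is_well_formed(
        tgt_heads, *tgt_span
    )


# Toy example: "he saw the dog yesterday"
# he -> saw, saw -> ROOT, the -> dog, dog -> saw, yesterday -> saw
toy_heads = [1, -1, 3, 1, 1]
```

With the toy head array, the span covering "the dog" passes the test, while the span covering "saw the" fails because its two words attach to different heads outside the span. A rule would be kept only when the source-side span and the target-side span both pass.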
