Analysing soft syntax features and heuristics for hierarchical phrase-based machine translation

Like phrase-based machine translation, hierarchical systems extract a large number of phrases, most of which are presumably junk and of no use for the actual translation. In the hierarchical case, however, the number of extracted rules is an order of magnitude larger. In this paper, we investigate several soft constraints applied during the extraction of hierarchical phrases and examine whether they help as additional scores in decoding to prune unneeded phrases. We show which methods help best.
