A Systematic Analysis of Translation Model Search Spaces

Translation systems are complex, and most metrics do little to pinpoint causes of error or isolate system differences. We use a simple technique to discover induction errors, which occur when good translations are absent from model search spaces. Our results show that a common pruning heuristic drastically increases induction error, and also strongly suggest that the search spaces of phrase-based and hierarchical phrase-based models are highly overlapping despite the well-known structural differences.

Philipp Koehn | Adam Lopez | Michael Auli | Hieu Hoang

[1] Franz Josef Och,et al. A Systematic Comparison of Phrase-Based, Hierarchical and Syntax-Augmented Statistical MT , 2008, COLING.

[2] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[3] Lane Schwartz,et al. Multi-Source Translation Methods , 2008, AMTA.

[4] Daniel Marcu,et al. Fast and optimal decoding for machine translation , 2004, Artif. Intell.

[5] Daniel Gildea,et al. Extracting Synchronous Grammar Rules From Word-Level Alignments in Linear Time , 2008, COLING.

[6] Necip Fazil Ayan,et al. Going Beyond AER: An Extensive Analysis of Word Alignments and Their Impact on MT , 2006, ACL.

[7] Adam Lopez,et al. Translation as Weighted Deduction , 2009, EACL.

[8] Thomas L. Griffiths,et al. A fully Bayesian approach to unsupervised part-of-speech tagging , 2007, ACL.

[9] David Chiang,et al. Hierarchical Phrase-Based Translation , 2007, CL.

[10] Franz Josef Och,et al. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[11] Daniel Marcu,et al. Statistical Phrase-Based Translation , 2003, NAACL.

[12] Keith Hall,et al. Comparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation , 2007, HLT-NAACL 2007.

[13] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.

[14] Adam Lopez. Tera-Scale Translation Models via Pattern Matching , 2008, COLING.

[15] Phil Blunsom,et al. Probabilistic Inference for Machine Translation , 2008, EMNLP.

[16] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[17] I. Dan Melamed,et al. Empirical Lower Bounds on the Complexity of Translational Equivalence , 2006, ACL.

[18] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[19] Philipp Koehn,et al. Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.

[20] Miles Osborne,et al. Statistical Machine Translation , 2010, Encyclopedia of Machine Learning and Data Mining.