Paper information - Maximum-entropy word alignment and posterior-based phrase extraction for machine translation

A fundamental assumption in statistical machine translation (SMT) is that the correspondence between a sentence and its translation can be explained by an alignment between their words. This alignment is typically not observed in the parallel corpora used to build the phrase table of an SMT system. It is therefore customary to estimate a probabilistic model of the hidden word alignment, which is then used to extract bilingual phrase pairs. Standard extraction heuristics under-exploit the alignment model: the only information they use from the posterior distribution is the Viterbi best alignment. This is due to the high computational complexity of the IBM models, which are the de facto standard for computing these alignments. These models have other limitations as well, including their asymmetry and their inability to integrate rich, feature-based descriptions. We argue that refining the word alignment model in a discriminative maximum-entropy framework substantially improves alignment quality. We also show that these improved alignments, combined with efficient and accurate computation of the link posterior distributions, can improve overall translation performance, especially when posterior-based extraction methods are applied.
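
To make the idea of link posteriors concrete, here is a minimal sketch, not the authors' model. It assumes a binary maximum-entropy (logistic) classifier that scores each candidate link between a source and a target word independently, and it compares a single hard alignment with a lower posterior threshold that admits more links. The feature names, the weights, the toy lexicon, and the helper functions (`link_posterior`, `posterior_matrix`, `links_above`, `toy_features`) are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy lexicon; a real system would learn such evidence from data.
TOY_LEXICON = {("la", "the"), ("maison", "house"), ("bleue", "blue")}

# Hypothetical hand-set weights; in practice they would be trained.
TOY_WEIGHTS = {"bias": -1.0, "in_lexicon": 4.0, "position_gap": -3.0}


def toy_features(src, tgt, i, j):
    """Illustrative features for the candidate link (src[i], tgt[j])."""
    return {
        "bias": 1.0,
        "in_lexicon": 1.0 if (src[i], tgt[j]) in TOY_LEXICON else 0.0,
        "position_gap": abs(i / len(src) - j / len(tgt)),
    }


def link_posterior(weights, features):
    """Logistic (binary maximum-entropy) probability that a link is active."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def posterior_matrix(src, tgt, weights, featurize):
    """Matrix P[i][j] of link posteriors for every source/target word pair."""
    return [
        [link_posterior(weights, featurize(src, tgt, i, j)) for j in range(len(tgt))]
        for i in range(len(src))
    ]


def links_above(posteriors, threshold):
    """Set of (i, j) links whose posterior is at least `threshold`."""
    return {
        (i, j)
        for i, row in enumerate(posteriors)
        for j, p in enumerate(row)
        if p >= threshold
    }


src = ["la", "maison", "bleue"]
tgt = ["the", "blue", "house"]
P = posterior_matrix(src, tgt, TOY_WEIGHTS, toy_features)

# With independent link decisions, keeping links with posterior >= 0.5
# gives the single most probable alignment.
hard_links = links_above(P, 0.5)

# A lower threshold keeps less certain links too, which a posterior-based
# extraction step could then weight instead of discarding.
soft_links = links_above(P, 0.2)
```

In this toy setting the lower threshold returns a superset of the hard alignment, which is the extra information that extraction from a single best alignment discards.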

Alexandre Allauzen | François Yvon | Nadi Tomeh
