Sampling Alignment Structure under a Bayesian Translation Model

We describe the first tractable Gibbs sampling procedure for estimating phrase pair frequencies under a probabilistic model of phrase alignment. We propose and evaluate two nonparametric priors that successfully avoid the degenerate behavior noted in previous work, where overly large phrases memorize the training data. Phrase table weights learned under our model yield an increase in BLEU score over the word-alignment-based heuristic estimates regularly used in phrase-based translation systems.
