Bilingual Markov Reordering Labels for Hierarchical SMT

Earlier work on labeling Hiero grammars with monolingual syntax reports improved performance, suggesting that such labeling may affect phrase reordering as well as lexical selection. In this paper we explore inducing bilingual labels for Hiero grammars using no resources beyond those that Hiero itself uses. Our bilingual labels aim to capture salient patterns of phrase reordering in the parallel training corpus. They originate from hierarchical factorizations of the word alignments in Hiero’s own training data. We take a Markovian view of synchronous top-down derivations over these factorizations, which allows us to extract 0th- and 1st-order bilingual reordering labels. Using exactly the same training data as Hiero, we show that the Markovian interpretation of word alignment factorization offers major benefits over the unlabeled version. We report extensive experiments with strict and soft bilingual labeled Hiero, showing improvements of up to 1 BLEU point for Chinese-English and about 0.1 BLEU points for German-English.
