Exploiting syntactic relationships in a phrase-based decoder: an exploration

Phrase-based decoding is conceptually simple and straightforward to implement, at the cost of drastically oversimplified reordering models. Syntactically aware models make it possible to capture linguistically relevant relationships in order to improve word order, but they can be more complex to implement and optimise. In this paper, we explore a new middle ground between phrase-based and syntactically informed statistical MT, in the form of a model that supplements conventional, non-hierarchical phrase-based techniques with linguistically informed reordering based on syntactic dependency trees. The key idea is to exploit linguistically-informed hierchical structures only for those dependencies that cannot be captured within a single flat phrase. For very local dependencies we leverage the success of conventional phrase-based approaches, which provide a sequence of target-language words appropriately ordered and ready-made with any agreement morphology. Working with dependency trees rather than constituency trees allows us to take advantage of the flexibility of phrase-based systems to treat non-constituent fragments as phrases. We do impose a requirement—that the fragment be a novel sort of “dependency constituent”—on what can be translated as a phrase, but this is much weaker than the requirement that phrases be traditional linguistic constituents, which has often proven too restrictive in MT systems.

[1]  Philip Resnik,et al.  Online Large-Margin Training of Syntactic and Structural Translation Features , 2008, EMNLP.

[2]  Philip Resnik,et al.  Extending Phrase-Based Decoding with a Dependency-Based Reordering Model , 2009 .

[3]  Dekai Wu,et al.  Machine Translation with a Stochastic Grammatical Channel , 1998, COLING-ACL.

[4]  John Cocke,et al.  A Statistical Approach to Machine Translation , 1990, CL.

[5]  Philipp Koehn,et al.  Clause Restructuring for Statistical Machine Translation , 2005, ACL.

[6]  Heidi Fox,et al.  Phrasal Cohesion and Statistical Machine Translation , 2002, EMNLP.

[7]  Kenji Kita,et al.  Spoken Language Translation System , 1993, IJCAI.

[8]  Hermann Ney,et al.  Improved Alignment Models for Statistical Machine Translation , 1999, EMNLP.

[9]  Christopher D. Manning,et al.  A Simple and Effective Hierarchical Phrase Reordering Model , 2008, EMNLP.

[10]  Daniel Gildea,et al.  Loosely Tree-Based Alignment for Machine Translation , 2003, ACL.

[11]  Kevin Knight,et al.  A Syntax-based Statistical Translation Model , 2001, ACL.

[12]  Chris Quirk,et al.  Dependency Treelet Translation: Syntactically Informed Phrasal SMT , 2005, ACL.

[13]  Jason Eisner,et al.  Learning Linear Ordering Problems for Better Translation , 2009, EMNLP.

[14]  David Chiang,et al.  A Hierarchical Phrase-Based Model for Statistical Machine Translation , 2005, ACL.

[15]  Spyridon Matsoukas,et al.  Effective Use of Linguistic and Contextual Information for Statistical Machine Translation , 2009, EMNLP.

[16]  Daniel Marcu,et al.  What’s in a translation rule? , 2004, NAACL.

[17]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[18]  Philip Resnik,et al.  Evaluating Translational Correspondence using Annotation Projection , 2002, ACL.

[19]  Wolfgang Macherey,et al.  Lattice-based Minimum Error Rate Training for Statistical Machine Translation , 2008, EMNLP.

[20]  Martin Cmejrek,et al.  Prague Czech-English Dependency Treebank: Any Hopes for a Common Annotation Scheme? , 2004, FCP@NAACL-HLT.

[21]  Fei Xia,et al.  Improving a Statistical MT System with Automatically Learned Rewrite Patterns , 2004, COLING.

[22]  Jinxi Xu,et al.  A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model , 2008, ACL.

[23]  Mari Ostendorf,et al.  Expected dependency pair match: predicting translation quality with expected syntactic structure , 2009, Machine Translation.

[24]  Bowen Zhou,et al.  Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions , 2010, EMNLP.

[25]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.