A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation

This paper explores a simple and effective unified framework for incorporating soft linguistic reordering constraints into a hierarchical phrase-based translation system: 1) a syntactic reordering model that explores reorderings for context free grammar rules; and 2) a semantic reordering model that focuses on the reordering of predicate-argument structures. We develop novel features based on both models and use them as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the reordering approach can significantly improve a state-of-the-art hierarchical phrase-based translation system. However, the gain achieved by the semantic reordering model is limited in the presence of the syntactic reordering model, and we therefore provide a detailed analysis of the behavior differences between the two.

Philip Resnik | Junhui Li | Hal Daumé | Yuval Marton

[1] Philipp Koehn,et al. Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation , 2011, EMNLP.

[2] Ming Zhou,et al. A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation , 2007, ACL.

[3] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[4] Hermann Ney,et al. Advancements in Reordering Models for Statistical Machine Translation , 2013, ACL.

[5] Philipp Koehn,et al. Statistical Significance Tests for Machine Translation Evaluation , 2004, EMNLP.

[6] Hermann Ney,et al. A Phrase Orientation Model for Hierarchical Machine Translation , 2013, WMT@ACL.

[7] Bowen Zhou,et al. Two-Neighbor Orientation Model with Cross-Boundary Global Contexts , 2013, ACL.

[8] Jimmy J. Lin,et al. Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce , 2013, ACL.

[9] Khalil Sima'an,et al. Learning Hierarchical Translation Structure with Linguistic Annotations , 2011, ACL.

[10] Fei Xia,et al. Improving a Statistical MT System with Automatically Learned Rewrite Patterns , 2004, COLING.

[11] Martha Palmer,et al. The Penn Chinese TreeBank: Phrase structure annotation of a large corpus , 2005, Natural Language Engineering.

[12] Peng Xu,et al. Using a Dependency Parser to Improve SMT for Subject-Object-Verb Languages , 2009, NAACL.

[13] Hwee Tou Ng,et al. Joint Syntactic and Semantic Parsing of Chinese , 2010, ACL.

[14] David Chiang,et al. Learning to Translate with Source and Target Syntax , 2010, ACL.

[15] Chao Wang,et al. Chinese Syntactic Reordering for Statistical Machine Translation , 2007, EMNLP.

[16] Daniel Gildea,et al. The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.

[17] Nianwen Xue,et al. Adding semantic roles to the Chinese Treebank , 2009, Natural Language Engineering.

[18] Pascale Fung,et al. Semantic Roles for SMT: A Hybrid Two-Pass Model , 2009, NAACL.

[19] Ding Liu,et al. Semantic Role Features for Machine Translation , 2010, COLING.

[20] Hermann Ney,et al. Improved Statistical Alignment Models , 2000, ACL.

[21] Haizhou Li,et al. Topological Ordering of Function Words in Hierarchical Phrase-based Translation , 2009, ACL/IJCNLP.

[22] Stephan Vogel,et al. Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation , 2013, ACL.

[23] Vladimir Eidelman,et al. cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models , 2010, ACL.

[24] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[25] Karthik Visweswariah,et al. Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation , 2010, COLING.

[26] Slav Petrov,et al. Source-Side Classifier Preordering for Machine Translation , 2013, EMNLP.

[27] Rabih Zbib,et al. Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation , 2013, EMNLP.

[28] Kevin Duh,et al. Extracting Pre-ordering Rules from Predicate-Argument Structures , 2011, IJCNLP.

[29] David Chiang,et al. Hierarchical Phrase-Based Translation , 2007, CL.

[30] Yu Zhou,et al. Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation , 2013, ACL.

[31] Philipp Koehn,et al. Clause Restructuring for Statistical Machine Translation , 2005, ACL.

[32] Philip Resnik,et al. Modeling Syntactic and Semantic Structures in Hierarchical Phrase-based Translation , 2013, HLT-NAACL.

[33] Daniel Jurafsky,et al. A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005 , 2005, IJCNLP.

[34] Nenghai Yu,et al. A Ranking-based Approach to Word Reordering for Statistical Machine Translation , 2012, ACL.

[35] Khalil Sima'an,et al. Context-Sensitive Syntactic Source-Reordering by Statistical Transduction , 2011, IJCNLP.

[36] Colin Cherry. Improved Reordering for Phrase-Based Translation using Sparse Features , 2013, HLT-NAACL.

[37] Sophia Ananiadou,et al. Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty , 2009, ACL.

[38] Daniel Marcu,et al. Statistical Phrase-Based Translation , 2003, NAACL.

[39] Dmitriy Genzel,et al. Automatically Learning Source-side Reordering Rules for Large Scale Machine Translation , 2010, COLING.

[40] Dan Klein,et al. Improved Inference for Unlexicalized Parsing , 2007, NAACL.

[41] Philip Resnik,et al. Soft Syntactic Constraints for Hierarchical Phrased-Based Translation , 2008, ACL.

[42] Xiaoqiang Luo,et al. Constituent Reordering and Syntax Models for English-to-Japanese Statistical Machine Translation , 2010, COLING.

[43] Niyu Ge. A Direct Syntax-Driven Reordering Model for Phrase-Based Machine Translation , 2010, HLT-NAACL.

[44] Philip Resnik,et al. Online Large-Margin Training of Syntactic and Structural Translation Features , 2008, EMNLP.

[45] Haizhou Li,et al. Modeling the Translation of Predicate-Argument Structure for SMT , 2012, ACL.