Phrase-boundary model for statistical machine translation

We propose an SMT model that labels nonterminals with the boundary word classes of phrases. Word classes can be defined by POS tags or by automatic word clustering. The model is filtered according to the alignment pattern of phrase pairs. Only limited patterns of rules are extracted from phrase pairs that are decomposable.

This paper proposes a new probabilistic synchronous context-free grammar (PSCFG) model for statistical machine translation. The model labels nonterminals with the classes of the boundary words on the target side of aligned phrase pairs. Rules are labeled with coarse-grained and fine-grained nonterminals, using POS tags and word clusters trained on the target-language corpus. Because the diversity of nonterminals makes the model large, we also propose a novel approach to filtered rule extraction based on the alignment pattern of phrase pairs. By allowing only a limited set of rule patterns, it restricts the extraction of hierarchical rules from phrase pairs that can be decomposed into two aligned subphrases. The filtered rule extraction considerably decreases the model size and the decoding time, with no significant impact on translation quality. In our experiments, measured with BLEU, the proposed model achieved a notable improvement over the state-of-the-art hierarchical phrase-based model when translating from Persian, French and Spanish into English. The approach is applicable to any language, including under-resourced ones that lack linguistic tools.
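The two ideas in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation. The label format `first..last`, the function names `boundary_label` and `is_decomposable`, and the `UNK` fallback class are all assumptions made for illustration. In particular, the decomposability test below is only one plausible reading of "decomposable to two aligned subphrases": a phrase pair counts as decomposable if its source and target sides can each be split into two contiguous parts whose alignment links pair up either in the same order or swapped.

```python
from typing import Dict, Sequence, Set, Tuple


def boundary_label(target_phrase: Sequence[str],
                   word_class: Dict[str, str],
                   unknown: str = "UNK") -> str:
    """Build a nonterminal label from the classes of the first and last
    target-side words. The 'first..last' format is an assumption."""
    if not target_phrase:
        raise ValueError("target phrase must be non-empty")
    first = word_class.get(target_phrase[0], unknown)
    last = word_class.get(target_phrase[-1], unknown)
    return f"{first}..{last}"


def is_decomposable(src_len: int, tgt_len: int,
                    links: Set[Tuple[int, int]]) -> bool:
    """Return True if the phrase pair splits into two aligned subphrases.

    links holds (source_index, target_index) pairs, 0-based and relative
    to the phrase pair. This test is a hypothetical interpretation, not
    the paper's exact criterion.
    """
    for k in range(1, src_len):          # source split point
        for m in range(1, tgt_len):      # target split point
            # Monotone: left source part aligns only to left target part.
            monotone = all((i < k) == (j < m) for i, j in links)
            # Inverted: left source part aligns only to right target part.
            inverted = all((i < k) != (j < m) for i, j in links)
            # Each source part must carry at least one alignment link.
            left_aligned = any(i < k for i, _ in links)
            right_aligned = any(i >= k for i, _ in links)
            if (monotone or inverted) and left_aligned and right_aligned:
                return True
    return False
```

With POS tags as word classes, for example `{"the": "DT", "red": "JJ", "house": "NN"}`, the target phrase `the red house` would receive the label `DT..NN`. Passing cluster identifiers from automatic word clustering instead of POS tags would give the variant that needs no linguistic tools. A filter in this spirit would call `is_decomposable` on each phrase pair and then allow only a limited set of rule patterns to be extracted from the pairs for which it returns `True`.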
