Incremental Word Re-Ordering and Article Generation: Its Application to Japanese-to-English Machine Translation

This paper introduces a novel word reordering model for statistical machine translation that employs a shift-reduce parser for inversion transduction grammars. The model also handles article generation jointly with word reordering. We apply it to post-ordering in phrase-based machine translation (PBMT) for Japanese-to-English patent translation. In our experiments, the method yields a significant gain of +3.15 BLEU points over the baseline PBMT system, which scores 29.99 BLEU.
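
The abstract does not spell out the parser's action set or its scoring model, so the following is only a minimal sketch of the general idea: a shift-reduce procedure whose reduce steps combine two adjacent spans either in straight or in inverted order, as in an inversion transduction grammar, with an extra action that inserts an article. The action names (`SHIFT`, `REDUCE_STRAIGHT`, `REDUCE_INVERTED`, `INSERT`), the function `apply_actions`, and the example sentence are assumptions made for illustration; the action sequence is given by hand here, whereas a real system would predict it with a trained model.

```python
from typing import List, Tuple, Union

# Hypothetical action inventory for this sketch (not taken from the paper).
SHIFT = "SHIFT"                 # move the next input word onto the stack
STRAIGHT = "REDUCE_STRAIGHT"    # merge the top two spans, keeping their order
INVERTED = "REDUCE_INVERTED"    # merge the top two spans, swapping their order
INSERT = "INSERT"               # ("INSERT", article): prepend an article to the top span

Action = Union[str, Tuple[str, str]]


def apply_actions(words: List[str], actions: List[Action]) -> List[str]:
    """Replay a shift-reduce action sequence over `words` and return the output order."""
    queue = list(words)          # remaining input, left to right
    stack: List[List[str]] = []  # each stack item is a span of output words
    for act in actions:
        if act == SHIFT:
            if not queue:
                raise ValueError("SHIFT with empty input")
            stack.append([queue.pop(0)])
        elif act in (STRAIGHT, INVERTED):
            if len(stack) < 2:
                raise ValueError("REDUCE needs two spans on the stack")
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if act == STRAIGHT else right + left)
        elif isinstance(act, tuple) and len(act) == 2 and act[0] == INSERT:
            if not stack:
                raise ValueError("INSERT needs a span on the stack")
            stack[-1] = [act[1]] + stack[-1]
        else:
            raise ValueError(f"unknown action: {act!r}")
    if queue or len(stack) != 1:
        raise ValueError("action sequence did not consume the input into one span")
    return stack[0]


# English words in Japanese-like (verb-final) order, without articles.
head_final = ["he", "book", "read"]
actions: List[Action] = [
    SHIFT,            # stack: [he]
    SHIFT,            # stack: [he] [book]
    (INSERT, "a"),    # stack: [he] [a book]
    SHIFT,            # stack: [he] [a book] [read]
    INVERTED,         # stack: [he] [read a book]
    STRAIGHT,         # stack: [he read a book]
]
reordered = apply_actions(head_final, actions)
```

In this toy run the verb-final input "he book read" comes out as "he read a book": the inverted reduce moves the verb in front of its object, and the insert action supplies the article in the same left-to-right pass rather than in a separate step.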
