The NAIST machine translation system for IWSLT2012

This paper describes the NAIST statistical machine translation system for the IWSLT2012 Evaluation Campaign. We participated in all TED Talk tasks, for a total of 11 language pairs. For all tasks, we use the Moses phrase-based decoder and its experiment management system as a common base for building translation systems. The focus of our work is on performing a comprehensive comparison of a multitude of existing techniques for the TED task, exploring issues such as out-of-domain data filtering, minimum Bayes risk decoding, MERT vs. PRO tuning, word alignment combination, and morphology.

Tomoki Toda | Satoshi Nakamura | Kevin Duh | Graham Neubig | Takatomo Kano | Sakriani Sakti | Tetsuo Kiso | Masaya Ohgushi
