Paper information - Phrasal: A Statistical Machine Translation Toolkit for Exploring New Model Features


We present a new Java-based, open-source toolkit for phrase-based machine translation. The key innovation of the toolkit is its use of APIs for integrating new features (knowledge sources) into the decoding model and for extracting feature statistics from aligned bitexts. The package includes a number of useful features written to these APIs, including features for hierarchical reordering, discriminatively trained linear distortion, and syntax-based language models. Other useful utilities packaged with the toolkit include a conditional phrase extraction system that builds a phrase table for a specific dataset only, and an implementation of MERT that allows pluggable evaluation metrics for both training and evaluation, with built-in support for a variety of metrics (e.g., TERp, BLEU, METEOR).

Daniel Jurafsky | Christopher D. Manning | Daniel M. Cer | Michel Galley
