Using Paraphrases for Parameter Tuning in Statistical Machine Translation

Most state-of-the-art statistical machine translation systems use log-linear models, which are defined in terms of hypothesis features and weights for those features. It is standard to tune the feature weights in order to maximize a translation quality metric, using held-out test sentences and their corresponding reference translations. However, obtaining reference translations is expensive. In this paper, we introduce a new full-sentence paraphrase technique, based on English-to-English decoding with an MT system, and we demonstrate that the resulting paraphrases can be used to drastically reduce the number of human reference translations needed for parameter tuning, without a significant decrease in translation quality.
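The log-linear scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes each hypothesis carries a dictionary of feature values and that the model score is the weighted sum of those features. The feature names (`lm`, `tm`, `length`), their values, and the weights are hypothetical and are not taken from the paper. Only the scoring step is shown; parameter tuning would search for the `weights` that maximize a translation quality metric against reference translations.

```python
def loglinear_score(features, weights):
    """Model score of one hypothesis: sum over features of weight * value."""
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(hypotheses, weights):
    """Return the candidate translation with the highest model score."""
    return max(hypotheses, key=lambda h: loglinear_score(h["features"], weights))


# Hypothetical weights and candidate translations, for illustration only.
weights = {"lm": 0.5, "tm": 0.3, "length": 0.2}
hypotheses = [
    {"text": "the house is small",
     "features": {"lm": -2.0, "tm": -1.0, "length": 4.0}},
    {"text": "the home is little",
     "features": {"lm": -3.5, "tm": -0.5, "length": 4.0}},
]

best = best_hypothesis(hypotheses, weights)
print(best["text"])
```

Changing the weights changes which hypothesis wins, which is why the choice of weights is tuned against reference translations rather than set by hand.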

Nitin Madnani | Philip Resnik | Necip Fazil Ayan | Bonnie J. Dorr
