Word Confidence Estimation for SMT N-best List Re-ranking

This paper proposes using Word Confidence Estimation (WCE) information to improve MT output via N-best list re-ranking. From the confidence label assigned to each word of an MT hypothesis, we derive six scores that are added to the baseline log-linear model in order to re-rank the N-best list. First, we investigate the correlation between the WCE-based sentence-level scores and conventional evaluation scores (BLEU, TER, TERp-A). Then, N-best list re-ranking is evaluated at different WCE performance levels: from our real, efficient WCE system (ranked 1st in the WMT 2013 Quality Estimation shared task) to an oracle WCE, which simulates an interactive scenario where a user simply validates the words of an MT hypothesis and a new output is automatically re-generated. The results suggest that our real WCE system yields a small but statistically significant improvement over the baseline, while the oracle WCE boosts it dramatically; in general, better WCE leads to better MT quality.
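The re-ranking approach described above can be illustrated with a minimal sketch: sentence-level scores are derived from per-word confidence labels and combined log-linearly with the baseline model score. The two features and the weights below are purely illustrative assumptions, not the paper's actual six scores or tuned weights.

```python
# Hedged sketch of WCE-based N-best re-ranking. The feature set and
# weights are illustrative assumptions, not the paper's exact model.

def wce_features(labels):
    """Sentence-level scores derived from per-word confidence labels
    (True = word judged correct by the WCE system)."""
    n = len(labels)
    good = sum(labels)
    return {
        "good_ratio": good / n if n else 0.0,  # fraction of confident words
        "bad_count": float(n - good),          # number of suspect words
    }

def rerank(nbest, weights):
    """Re-rank (hypothesis, baseline_score, labels) triples using a
    log-linear combination of the baseline score and WCE features."""
    def total(entry):
        _, base, labels = entry
        feats = wce_features(labels)
        return weights["baseline"] * base + sum(
            weights[k] * v for k, v in feats.items())
    return sorted(nbest, key=total, reverse=True)

# Toy N-best list: (hypothesis, baseline log-linear score, word labels).
nbest = [
    ("the cat sat on mat", -4.2, [True, True, True, True, False]),
    ("the cat sat on the mat", -4.5, [True] * 6),
]
weights = {"baseline": 1.0, "good_ratio": 2.0, "bad_count": -0.5}
best = rerank(nbest, weights)[0][0]
print(best)  # the WCE scores promote the fully confident hypothesis
```

In this toy example the second hypothesis has a lower baseline score but all of its words are labeled correct, so the WCE features lift it to the top, mirroring the paper's intuition that word-level confidence can overrule the decoder's ranking.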
