An Open Source Toolkit for Word-level Confidence Estimation in Machine Translation

Recently, a growing need for Confidence Estimation (CE) for Statistical Machine Translation (SMT) systems in Computer Aided Translation (CAT) has been observed. However, most CE toolkits are optimized for a single target language (mainly English) and, as far as we know, none of them is both dedicated to this specific task and freely available. This paper presents an open-source toolkit for predicting the quality of individual words in SMT output. Its novel contributions are (i) support for various target languages and (ii) handling of features of several types (system-based, lexical, syntactic and semantic). The toolkit also integrates a wide variety of Natural Language Processing and Machine Learning tools to pre-process data, extract features and estimate confidence at the word level. Features for Word-level Confidence Estimation (WCE) can be easily added or removed through a configuration file. We validate the toolkit in the WCE evaluation framework of the WMT shared task on two language pairs: French-English and English-Spanish. The toolkit is made available to the research community with ready-made scripts to launch full experiments on these language pairs, achieving state-of-the-art and reproducible performance.
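
The idea of switching WCE features on and off from a configuration file can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `[features]` section, the feature names and the toy extractors are hypothetical and do not reflect the toolkit's actual configuration format or feature set, which this abstract does not describe.

```python
import configparser

# Hypothetical configuration text; the real toolkit's file format is not
# specified in the abstract.
CONFIG_TEXT = """
[features]
word_length = yes
is_punctuation = yes
is_numeric = no
"""

# Hypothetical toy extractors, each mapping (tokens, index) to a feature value.
EXTRACTORS = {
    "word_length": lambda tokens, i: len(tokens[i]),
    "is_punctuation": lambda tokens, i: all(not c.isalnum() for c in tokens[i]),
    "is_numeric": lambda tokens, i: tokens[i].isdigit(),
}


def enabled_features(config_text):
    """Return the names of the extractors that the config switches on."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["features"]
    return [name for name in EXTRACTORS
            if section.getboolean(name, fallback=False)]


def extract(tokens, feature_names):
    """Build one feature dictionary per word of the translation output."""
    return [{name: EXTRACTORS[name](tokens, i) for name in feature_names}
            for i in range(len(tokens))]


if __name__ == "__main__":
    names = enabled_features(CONFIG_TEXT)
    for row in extract(["Hello", ",", "42"], names):
        print(row)
```

Under this design, adding or removing a feature means editing one line of the configuration rather than the extraction code; an extractor that is not listed in the file is treated as disabled.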

Benjamin Lecouteux | Laurent Besacier | Christophe Servan | Ngoc-Tien Le | Ngoc Quang Luong
