MoBiL: A Hybrid Feature Set for Automatic Human Translation Quality Assessment

In this paper we introduce MoBiL, a hybrid Monolingual, Bilingual and Language modelling feature set, together with a framework for feature selection and evaluation. The set comprises translation quality indicators that can be used to automatically predict the quality of human translations in terms of content adequacy and language fluency. We compare MoBiL with the QuEst baseline feature set by training classifiers on each with support vector machine (SVM) and relevance vector machine (RVM) learning algorithms, using the same data set. We also report a feature selection experiment that aims to retain fewer but more informative MoBiL features. Our experiments show that classifiers trained on MoBiL consistently outperform those trained on the baseline feature set in predicting both adequacy and fluency, and that MoBiL performs well with both the SVM and the RVM algorithms.
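The comparison protocol described above (the same classifier, the same data, two different feature sets) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the "baseline" and "rich" feature sets are hypothetical stand-ins rather than QuEst or MoBiL, and the trainer is a small Pegasos-style linear support vector machine written with the Python standard library, not the implementation used in the paper. The relevance vector machine and the feature selection step are not shown.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss.

    No bias term is fitted: the toy data below are centred around zero.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # L2 regularisation step: shrink the weights.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Hinge-loss step: only margin violators move the weights.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        hits += (1 if score >= 0 else -1) == yi
    return hits / len(X)


def cross_val_accuracy(X, y, k=5):
    """Mean accuracy over k interleaved folds (same folds for any feature set)."""
    n = len(X)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        scores.append(accuracy(train_linear_svm(X_tr, y_tr), X_te, y_te))
    return sum(scores) / k


def make_toy_data(n=300, seed=42):
    """Synthetic rows [weak, noise, strong]; labels are +1 / -1 (hypothetical)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        weak = 0.3 * label + rng.gauss(0, 1)      # weakly informative feature
        noise = rng.gauss(0, 1)                   # uninformative feature
        strong = 1.5 * label + rng.gauss(0, 0.5)  # strongly informative feature
        rows.append([weak, noise, strong])
        labels.append(label)
    return rows, labels


X_all, y = make_toy_data()
X_baseline = [row[:2] for row in X_all]  # hypothetical baseline feature set
X_rich = X_all                           # baseline plus one extra feature

acc_baseline = cross_val_accuracy(X_baseline, y)
acc_rich = cross_val_accuracy(X_rich, y)
print(f"baseline feature set: {acc_baseline:.3f}")
print(f"rich feature set:     {acc_rich:.3f}")
```

Because both feature sets are scored on identical folds with an identical learner, any difference in accuracy can be attributed to the features rather than to the split or the algorithm. The numbers printed here come from toy data and say nothing about the results reported in the paper.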
