Learning to Translate with Multiple Objectives

We introduce an approach for optimizing a machine translation (MT) system on multiple metrics simultaneously. Different metrics (e.g., BLEU, TER) capture different aspects of translation quality; our multi-objective approach exploits these complementary aspects to improve overall quality. The approach is based on the theory of Pareto optimality. It is simple to implement on top of existing single-objective optimization methods (e.g., MERT, PRO) and outperforms ad hoc alternatives based on linear combinations of metrics. We also discuss the issue of metric tunability and show that our Pareto approach is more effective at incorporating new metrics from MT evaluation into MT optimization.
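To make the notion of Pareto optimality concrete, the following is a minimal sketch, not the authors' implementation. It assumes each candidate translation has already been scored on several metrics, with every score oriented so that higher is better (for example, using 1 - TER in place of TER). The function names `dominates` and `pareto_frontier` and the score values are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every metric
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates.

    This is the simple quadratic-time check; faster maximal-vector
    algorithms exist but are not needed for a small illustration.
    """
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (BLEU, 1 - TER) scores for four candidate translations.
scores = [(0.30, 0.50), (0.28, 0.55), (0.25, 0.45), (0.30, 0.52)]
frontier = pareto_frontier(scores)
print(frontier)
```

In this toy data, candidates 1 and 3 are Pareto-optimal: neither beats the other on both metrics, while candidates 0 and 2 are each matched or beaten on every metric by some other candidate. A linear combination with fixed weights would instead commit to a single trade-off between the two metrics.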
