On the use of different loss functions in statistical pattern recognition applied to machine translation

In pattern recognition, an elegant and powerful way to deal with classification problems is based on the minimisation of the classification risk. The risk function is defined in terms of loss functions that measure the penalty for wrong decisions. In practice, however, a trivial loss function (the so-called 0-1 loss function) is usually adopted, which does not make the most of this framework. This work focuses on the study of different loss functions, and especially on those that do not depend on the class proposed by the system. Loss functions of this kind have allowed us to explain theoretically heuristics that are successfully used in very complex pattern recognition problems, such as (statistical) machine translation. An experimental study has also been carried out to compare different proposals of loss functions in the practical scenario of machine translation.
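
The risk-minimisation framework described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the toy posterior and the weights `eps` are all hypothetical. It also assumes one particular reading of "loss functions that do not depend on the class proposed by the system", namely that the penalty for an error depends only on the correct class. Under that assumption, minimising the risk is equivalent to maximising the posterior multiplied by the per-class penalty, and the 0-1 loss is the special case in which every penalty equals 1.

```python
from typing import Callable, Dict, Hashable

Posterior = Dict[Hashable, float]
# loss(proposed, correct) -> penalty for proposing `proposed` when `correct` is true
Loss = Callable[[Hashable, Hashable], float]


def min_risk_decision(posterior: Posterior, loss: Loss) -> Hashable:
    """Return the class with minimum conditional risk under `loss`."""
    def risk(proposed: Hashable) -> float:
        # Expected loss of proposing `proposed`, averaged over the posterior.
        return sum(loss(proposed, correct) * p for correct, p in posterior.items())
    return min(posterior, key=risk)


def zero_one_loss(proposed: Hashable, correct: Hashable) -> float:
    """0-1 loss: every error costs 1, a correct decision costs 0."""
    return 0.0 if proposed == correct else 1.0


def make_error_weighted_loss(eps: Dict[Hashable, float]) -> Loss:
    """Assumed form: an error costs eps[correct], whatever class was proposed."""
    def loss(proposed: Hashable, correct: Hashable) -> float:
        return 0.0 if proposed == correct else eps[correct]
    return loss


# Toy numbers, purely illustrative.
posterior = {"a": 0.5, "b": 0.3, "c": 0.2}
eps = {"a": 1.0, "b": 2.0, "c": 1.0}

# With the 0-1 loss the rule reduces to picking the most probable class.
map_choice = min_risk_decision(posterior, zero_one_loss)

# With the weighted loss the rule picks the class maximising eps * posterior.
weighted_choice = min_risk_decision(posterior, make_error_weighted_loss(eps))
```

In this toy example the 0-1 loss selects the most probable class, while the weighted loss can select a less probable class whose misclassification would be more costly.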
