Improving Statistical Machine Translation Using Word Sense Disambiguation

We show for the first time that incorporating the predictions of a word sense disambiguation system within a typical phrase-based statistical machine translation (SMT) model consistently improves translation quality across all three IWSLT Chinese-English test sets and produces statistically significant improvements on the larger NIST Chinese-English MT task. Moreover, it never hurts performance on any test set, according not only to BLEU but to all eight of the most commonly used automatic evaluation metrics. Recent work has challenged the assumption that word sense disambiguation (WSD) systems are useful for SMT. Yet SMT translation quality still obviously suffers from inaccurate lexical choice. In this paper, we address this problem by investigating a new strategy for integrating WSD into an SMT system that performs fully phrasal multi-word disambiguation. Instead of directly incorporating a Senseval-style WSD system, we redefine the WSD task to match exactly the phrasal translation disambiguation task faced by phrase-based SMT systems. Our results provide the first known empirical evidence that lexical semantics are indeed useful for SMT, despite claims to the contrary.
