Paper information - Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers

In this paper, motivated by the belief that a language model covering a larger context predicts better, we present two extensions to standard n-gram language models in statistical machine translation: a backward language model that augments the conventional forward language model, and a mutual information trigger model that captures long-distance dependencies beyond the scope of standard n-gram language models. We integrate the two proposed models into phrase-based statistical machine translation and conduct experiments on large-scale training data to investigate their effectiveness. Our experimental results show that both models significantly improve translation quality and together yield an improvement of up to 1 BLEU point over a competitive baseline.
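As a rough illustration of the two ideas named in the abstract, the sketch below estimates a backward bigram probability (a word conditioned on the word that follows it) and the pointwise mutual information of a trigger pair (one word appearing anywhere earlier in the same sentence as another). It is not the authors' implementation: the function names, the add-one smoothing, the sentence-level probability estimates, and the toy corpus are all assumptions made for this example, and it does not cover higher-order n-grams or integration into a decoder.

```python
import math
from collections import Counter


def backward_bigram_prob(sentences, word, next_word):
    """Add-one smoothed estimate of P(word | next_word), a backward bigram.

    Assumption: add-one smoothing is used only to keep the sketch short.
    """
    context, pairs, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for cur, nxt in zip(tokens, tokens[1:]):
            pairs[(cur, nxt)] += 1  # cur is predicted from its right neighbour
            context[nxt] += 1
    return (pairs[(word, next_word)] + 1) / (context[next_word] + len(vocab))


def trigger_pmi(sentences, x, y):
    """PMI of the trigger pair (x, y), where x occurs somewhere before y.

    Assumption: probabilities are estimated as fractions of sentences.
    """
    n = len(sentences)
    has_x = sum(1 for s in sentences if x in s)
    has_y = sum(1 for s in sentences if y in s)
    both = sum(1 for s in sentences if x in s and y in s[s.index(x) + 1:])
    if both == 0:
        return float("-inf")  # the pair never co-occurs in this order
    return math.log((both / n) / ((has_x / n) * (has_y / n)))


# Hypothetical toy corpus, for illustration only.
corpus = [
    ["the", "bank", "raised", "interest", "rates"],
    ["the", "bank", "cut", "interest", "rates"],
    ["the", "river", "bank", "flooded"],
    ["rates", "fell"],
]

p_backward = backward_bigram_prob(corpus, "interest", "rates")
pmi_forward = trigger_pmi(corpus, "interest", "rates")
pmi_reverse = trigger_pmi(corpus, "rates", "bank")
```

In this toy corpus "interest" precedes "rates" more often than chance would predict, so the pair receives a positive PMI, while "rates" never precedes "bank" and the reversed pair gets negative infinity. A trigger pair does not need to be adjacent, which is what lets this kind of score reach beyond an n-gram window.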

Haizhou Li | Deyi Xiong | Min Zhang
