Paper information - Incremental Topic-Based Translation Model Adaptation for Conversational Spoken Language Translation

We describe a translation model adaptation approach for conversational spoken language translation (CSLT), which encourages the use of contextually appropriate translation options from relevant training conversations. Our approach employs a monolingual LDA topic model to derive a similarity measure between the test conversation and the set of training conversations, which is used to bias translation choices towards the current context. A significant novelty of our adaptation technique is its incremental nature; we continuously update the topic distribution on the evolving test conversation as new utterances become available. Thus, our approach is well-suited to the causal constraint of spoken conversations. On an English-to-Iraqi CSLT task, the proposed approach gives significant improvements over a baseline system as measured by BLEU, TER, and NIST. Interestingly, the incremental approach outperforms a non-incremental oracle that has up-front knowledge of the whole conversation.
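The incremental idea in the abstract can be illustrated with a small sketch: after each new utterance, the topic distribution of the conversation so far is re-estimated and compared against the topic distributions of the training conversations. The abstract does not state the inference procedure or the exact similarity measure, so everything below is an assumption made for illustration: the toy topic-word table `TOPIC_WORD`, the fold-in EM estimate in `infer_topics`, the use of cosine similarity, and the training conversation names are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy values of p(word | topic) for two topics. In a real system
# these would come from an LDA model trained on the training conversations.
TOPIC_WORD = {
    "checkpoint": [0.40, 0.05],
    "vehicle":    [0.35, 0.05],
    "doctor":     [0.05, 0.40],
    "medicine":   [0.05, 0.35],
    "please":     [0.15, 0.15],
}
NUM_TOPICS = 2


def infer_topics(words, iterations=50, alpha=0.1):
    """Estimate a topic mixture for a bag of words with the topics held fixed.

    This is a simple fold-in EM estimate (an assumption, not the paper's
    inference method). Words missing from TOPIC_WORD are ignored.
    """
    known = [w for w in words if w in TOPIC_WORD]
    theta = [1.0 / NUM_TOPICS] * NUM_TOPICS
    if not known:
        return theta  # no evidence yet: stay with the uniform mixture
    for _ in range(iterations):
        counts = [alpha] * NUM_TOPICS  # small smoothing pseudo-count
        for w in known:
            post = [theta[k] * TOPIC_WORD[w][k] for k in range(NUM_TOPICS)]
            z = sum(post)
            for k in range(NUM_TOPICS):
                counts[k] += post[k] / z
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


def cosine(a, b):
    """Cosine similarity between two topic vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def incremental_similarities(utterances, training_topics):
    """Re-infer topics on the conversation so far after each utterance and
    score every training conversation against the updated distribution."""
    history = []
    results = []
    for utterance in utterances:
        history.extend(utterance.lower().split())
        theta = infer_topics(history)
        results.append(
            {name: cosine(theta, vec) for name, vec in training_topics.items()}
        )
    return results


# Hypothetical topic distributions of two training conversations.
training_topics = {
    "checkpoint_dialog": [0.9, 0.1],
    "clinic_dialog": [0.1, 0.9],
}
test_conversation = ["please stop the vehicle", "this is a checkpoint"]

for step, sims in enumerate(
    incremental_similarities(test_conversation, training_topics), start=1
):
    # Report the most similar training conversation after each utterance.
    print(step, max(sims, key=sims.get))
```

In a translation system, similarity scores of this kind could be used to weight translation options drawn from the more relevant training conversations. Only the utterances seen so far are used at each step, which is what makes the scheme compatible with the causal constraint mentioned above.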