A deep source-context feature for lexical selection in statistical machine translation

Highlights:
- Introduction of source-context deep features into a standard phrase-based statistical machine translation system.
- Computation of sentence similarity by means of autoencoders.
- Effective low-dimensional embedding of data.

This paper presents a methodology to address lexical disambiguation in a standard phrase-based statistical machine translation system. Similarity among source contexts is used to select appropriate translation units. This information is introduced as a novel feature of the phrase-based model and is used to select the translation units extracted from the training sentences most similar to the sentence to be translated. The similarity is computed through a deep autoencoder representation, which yields an effective low-dimensional embedding of the data and statistically significant BLEU score improvements on two different tasks (English-to-Spanish and English-to-Hindi).
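The idea of scoring source-context similarity through an autoencoder embedding can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the toy vocabulary, the single tied-weight sigmoid layer, the hidden size, and the plain gradient-descent training loop are all assumptions made for brevity (the paper uses a deep autoencoder). Training source contexts are encoded into a low-dimensional space, and a new source sentence is compared to them by cosine similarity to rank the translation units they provide.

```python
import numpy as np

# Toy vocabulary for bag-of-words sentence vectors (illustrative assumption).
VOCAB = {w: i for i, w in enumerate(
    "the bank river money loan water flows account opened near".split())}

def bow(sentence, vocab=VOCAB):
    """Bag-of-words count vector over a fixed vocabulary."""
    v = np.zeros(len(vocab))
    for w in sentence.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

class TiedAutoencoder:
    """One sigmoid hidden layer with tied encode/decode weights,
    trained by batch gradient descent on squared reconstruction error."""
    def __init__(self, n_in, n_hidden, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.lr = lr

    @staticmethod
    def _sig(x):
        return 1.0 / (1.0 + np.exp(-x))

    def encode(self, x):
        return self._sig(x @ self.W)

    def train(self, X, epochs=500):
        for _ in range(epochs):
            h = self._sig(X @ self.W)        # encode
            r = self._sig(h @ self.W.T)      # decode with tied weights
            dr = (r - X) * r * (1.0 - r)     # output delta
            dh = (dr @ self.W) * h * (1.0 - h)  # hidden delta
            # Gradient has an encode-path and a decode-path term.
            grad = X.T @ dh + dr.T @ h
            self.W -= self.lr * grad / len(X)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Source contexts from which translation units were extracted.
contexts = [
    "the bank opened my account",
    "money loan at the bank",
    "the river bank near the water",
    "water flows near the river",
]
X = np.stack([bow(s) for s in contexts])
ae = TiedAutoencoder(n_in=X.shape[1], n_hidden=4)
ae.train(X)

# Embed the training contexts and a new sentence to translate,
# then rank contexts (and hence their translation units) by similarity.
embeddings = [ae.encode(x) for x in X]
query = ae.encode(bow("loan money at the bank"))
scores = [cosine(e, query) for e in embeddings]
```

In the full system, such a similarity score would enter the phrase-based model's log-linear combination as an additional feature weighting translation units by how close their source context is to the input sentence.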
